# Compare commits

develop...multiplatf: 1 commit (bed46ff2d2)

## .gitignore (vendored, 3 changes)
```diff
@@ -14,6 +14,9 @@ logs/
 *.log
 
 # Tracking and cache files
+karaoke_tracking.json
+karaoke_tracking.json.backup
+songlist_tracking.json
 *.cache
 
 # yt-dlp temporary files
```
## PRD.md (385 changes)
```diff
@@ -1,8 +1,8 @@
 
-# 🎤 Karaoke Video Downloader – PRD (v3.4.4)
+# 🎤 Karaoke Video Downloader – PRD (v3.5)
 
 ## ✅ Overview
-A Python-based cross-platform CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp`, with advanced tracking, songlist prioritization, and flexible configuration. Supports Windows and macOS with automatic platform detection. The codebase has been comprehensively refactored into a modular architecture with centralized utilities for improved maintainability, error handling, and code reuse.
+A Python-based cross-platform CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp`, with advanced tracking, songlist prioritization, and flexible configuration. Supports Windows, macOS, and Linux with automatic platform detection and optimized caching. The codebase has been comprehensively refactored into a modular architecture with centralized utilities for improved maintainability, error handling, and code reuse.
 
 ---
 
```
```diff
@@ -63,7 +63,7 @@ The codebase has been refactored into focused modules with centralized utilities
 ---
 
 ## ⚙️ Platform & Stack
-- **Platform:** Windows, macOS
+- **Platform:** Windows, macOS, Linux
 - **Interface:** Command-line (CLI)
 - **Tech Stack:** Python 3.7+, yt-dlp (platform-specific binary), mutagen (for ID3 tagging)
 
```
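A minimal sketch of the platform detection this stack relies on, mapping `platform.system()` to the per-OS yt-dlp binaries listed elsewhere in this diff; the helper name is illustrative, not the project's actual API:

```python
import platform
from pathlib import Path

# Binary names mirror the downloader/ entries shown in the project tree.
YTDLP_BINARIES = {
    "Windows": "yt-dlp.exe",
    "Darwin": "yt-dlp_macos",
    "Linux": "yt-dlp",
}

def select_ytdlp_binary(base_dir: str = "downloader") -> Path:
    """Pick the yt-dlp binary matching the current operating system."""
    system = platform.system()
    try:
        return Path(base_dir) / YTDLP_BINARIES[system]
    except KeyError:
        raise RuntimeError(f"Unsupported platform: {system}")
```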
```diff
@@ -101,7 +101,6 @@ python download_karaoke.py --clear-cache SingKingKaraoke
 - ✅ Songlist integration: prioritize and track custom songlists
 - ✅ Songlist-only mode: download only songs from the songlist
 - ✅ Songlist focus mode: download only songs from specific playlists by title
-- ✅ Force download mode: bypass all existing file checks and re-download songs regardless of server duplicates or existing files
 - ✅ Global songlist tracking to avoid duplicates across channels
 - ✅ ID3 tagging for artist/title in MP4 files (mutagen)
 - ✅ Real-time progress and detailed logging
```
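The "global songlist tracking to avoid duplicates" feature above implies a duplicate check keyed on artist/title. A sketch under the assumption that `data/songlist_tracking.json` holds a list of `{artist, title}` entries; the real schema and function names may differ:

```python
import json
from pathlib import Path

def normalize_key(artist: str, title: str) -> str:
    # Case- and whitespace-insensitive key, so re-listed videos with
    # slightly different capitalization still count as duplicates.
    return f"{artist.strip().lower()}|{title.strip().lower()}"

def already_downloaded(artist: str, title: str,
                       tracking_file: str = "data/songlist_tracking.json") -> bool:
    """Check a song against the global tracking file before downloading."""
    path = Path(tracking_file)
    if not path.exists():
        return False
    entries = json.loads(path.read_text(encoding="utf-8"))
    seen = {normalize_key(e["artist"], e["title"]) for e in entries}
    return normalize_key(artist, title) in seen
```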
```diff
@@ -123,8 +122,6 @@ python download_karaoke.py --clear-cache SingKingKaraoke
 - ✅ **Centralized file operations**: Single source of truth for filename sanitization, file validation, and path operations
 - ✅ **Centralized song validation**: Unified logic for checking if songs should be downloaded across all modules
 - ✅ **Enhanced configuration management**: Structured configuration with dataclasses, type safety, and validation
-- ✅ **Manual video collection**: Static video collection system for managing individual karaoke videos that don't belong to regular channels. Use `--manual` to download from `data/manual_videos.json`.
-- ✅ **Channel-specific parsing rules**: JSON-based configuration for parsing video titles from different YouTube channels, with support for various title formats and cleanup rules.
 
 ---
 
```
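The "centralized file operations" bullet above names filename sanitization as a single source of truth. A minimal sketch of what such a helper could look like; the character set and the "Artist - Title" format are assumptions, not the project's actual implementation:

```python
import re

# Characters that are invalid in Windows filenames; stripping them keeps
# generated names safe on Windows, macOS, and Linux alike.
INVALID_CHARS = r'[<>:"/\\|?*]'

def sanitize_filename(artist: str, title: str, ext: str = "mp4") -> str:
    """Build a cross-platform-safe filename from artist/title metadata."""
    stem = f"{artist} - {title}" if artist else title
    stem = re.sub(INVALID_CHARS, "", stem)    # drop forbidden characters
    stem = re.sub(r"\s+", " ", stem).strip()  # collapse runs of whitespace
    return f"{stem}.{ext}"
```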
```diff
@@ -152,34 +149,21 @@ KaroakeVideoDownloader/
 │   ├── check_resolution.py        # Resolution checker utility
 │   ├── resolution_cli.py          # Resolution config CLI
 │   └── tracking_cli.py            # Tracking management CLI
-├── config/                        # Configuration files
-│   └── config.json                # Main configuration file
-├── data/                          # All tracking, cache, and songlist files
+├── data/                          # All config, tracking, cache, and songlist files
+│   ├── config.json
 │   ├── karaoke_tracking.json
 │   ├── songlist_tracking.json
 │   ├── channel_cache.json
-│   ├── channels.json              # Channel configuration with parsing rules
-│   ├── manual_videos.json         # Manual video collection
+│   ├── channels.txt
 │   └── songList.json
-├── utilities/                     # Utility scripts and tools
-│   ├── add_manual_video.py        # Manual video management
-│   ├── build_cache_from_raw.py    # Cache building utility
-│   ├── cleanup_duplicate_files.py # File cleanup utilities
-│   ├── cleanup_recent_tracking.py # Tracking cleanup utilities
-│   ├── deduplicate_songlist_tracking.py # Data deduplication
-│   ├── fix_artist_name_format.py  # Data cleanup utilities
-│   ├── fix_artist_name_format_simple.py
-│   ├── fix_code_quality.py        # Development tools
-│   ├── reset_and_redownload.py    # Maintenance utilities
-│   └── songlist_report.py         # Reporting utilities
 ├── downloads/                     # All video output
 │   └── [ChannelName]/             # Per-channel folders
 ├── logs/                          # Download logs
 ├── downloader/yt-dlp.exe          # yt-dlp binary (Windows)
 ├── downloader/yt-dlp_macos        # yt-dlp binary (macOS)
-├── src/tests/                     # Test scripts
-│   ├── test_macos.py              # macOS setup and functionality tests
-│   └── test_platform.py           # Platform detection tests
+├── downloader/yt-dlp              # yt-dlp binary (Linux)
+├── tests/                         # Diagnostic and test scripts
+│   └── test_installation.py
 ├── download_karaoke.py            # Main entry point (thin wrapper)
 ├── README.md
 ├── PRD.md
```
```diff
@@ -194,8 +178,6 @@ KaroakeVideoDownloader/
 - `--songlist-priority`: Prioritize songlist songs in download queue
 - `--songlist-only`: Download only songs from the songlist
 - `--songlist-focus <PLAYLIST_TITLE1> <PLAYLIST_TITLE2>...`: Focus on specific playlists by title (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`)
-- `--songlist-file <FILE_PATH>`: Custom songlist file path to use with --songlist-focus (default: data/songList.json)
-- `--force`: **Force download from channels, bypassing all existing file checks and re-downloading if necessary**
 - `--songlist-status`: Show songlist download progress
 - `--limit <N>`: Limit number of downloads (enables fast mode with early exit)
 - `--resolution <720p|1080p|...>`: Override resolution
```
```diff
@@ -208,11 +190,7 @@ KaroakeVideoDownloader/
 - `--fuzzy-match`: **Enable fuzzy matching for songlist-to-video matching (uses rapidfuzz if available)**
 - `--fuzzy-threshold <N>`: **Fuzzy match threshold (0-100, default 85)**
 - `--parallel`: **Enable parallel downloads for improved speed**
-- `--workers <N>`: **Number of parallel download workers (1-10, default: 3, only used with --parallel)**
-- `--manual`: **Download from manual videos collection (data/manual_videos.json)**
-- `--channel-focus <CHANNEL_NAME>`: **Download from a specific channel by name (e.g., 'SingKingKaraoke')**
-- `--all-videos`: **Download all videos from channel (not just songlist matches), skipping existing files and songs in songs.json**
-- `--dry-run`: **Build download plan and show what would be downloaded without actually downloading anything**
+- `--workers <N>`: **Number of parallel download workers (1-10, default: 3)**
 
 ---
 
```
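The `--fuzzy-match` and `--fuzzy-threshold` options above describe score-based songlist-to-video matching. A hedged sketch that uses `rapidfuzz` when installed and falls back to the standard library, as the "uses rapidfuzz if available" note implies; the function names are illustrative:

```python
try:
    from rapidfuzz import fuzz

    def similarity(a: str, b: str) -> float:
        # Word-order-insensitive score in the range 0-100.
        return fuzz.token_sort_ratio(a.lower(), b.lower())
except ImportError:
    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        # Standard-library fallback, scaled to the same 0-100 range.
        return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()

def matches(song: str, video_title: str, threshold: float = 85) -> bool:
    """True when the songlist entry and video title are similar enough."""
    return similarity(song, video_title) >= threshold
```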
```diff
@@ -223,8 +201,6 @@ KaroakeVideoDownloader/
 - **ID3 Tagging:** Artist/title extracted from video title and embedded in MP4 files.
 - **Cleanup:** Extra files from yt-dlp (e.g., `.info.json`) are automatically removed after download.
 - **Reset/Clear:** Use `--reset-channel` to reset all tracking and files for a channel (optionally including songlist songs with `--reset-songlist`). Use `--clear-cache` to clear cached video lists for a channel or all channels.
-- **Channel-Specific Parsing:** Uses `data/channels.json` to define parsing rules for each YouTube channel, handling different video title formats (e.g., "Artist - Title", "Artist Title", "Title | Artist", etc.).
-- **Manual Video Collection:** Static video management system using `data/manual_videos.json` for individual karaoke videos that don't belong to regular channels. Accessible via `--manual` parameter.
 
 ## 🔧 Refactoring Improvements (v3.3)
 The codebase has been comprehensively refactored to improve maintainability and reduce code duplication. Recent improvements have enhanced reliability, performance, and code organization:
```
```diff
@@ -278,7 +254,7 @@ The codebase has been comprehensively refactored to improve maintainability and
 
 ### **New Parallel Download System (v3.4)**
 - **Parallel downloader module:** `parallel_downloader.py` provides thread-safe concurrent download management
-- **Configurable concurrency:** Use `--parallel` to enable parallel downloads with 3 workers by default, or `--parallel --workers N` for custom worker count (1-10)
+- **Configurable concurrency:** Use `--parallel --workers N` to enable parallel downloads with N workers (1-10)
 - **Thread-safe operations:** All tracking, caching, and progress operations are thread-safe
 - **Real-time progress tracking:** Shows active downloads, completion status, and overall progress
 - **Automatic retry mechanism:** Failed downloads are automatically retried with reduced concurrency
```
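The parallel-download bullets above describe a configurable worker pool with thread-safe progress tracking. A minimal self-contained sketch using `concurrent.futures`; `download_one` is a hypothetical stand-in for the real per-video download call in `parallel_downloader.py`:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading

def download_one(video_id: str) -> str:
    """Illustrative stand-in: pretend the download succeeded."""
    return video_id

def download_parallel(video_ids, workers: int = 3):
    """Run downloads concurrently; a lock keeps progress updates thread-safe."""
    done, lock = [], threading.Lock()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(download_one, v): v for v in video_ids}
        for fut in as_completed(futures):
            with lock:          # shared progress state needs the lock
                done.append(fut.result())
    return done
```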
````diff
@@ -286,245 +262,16 @@ The codebase has been comprehensively refactored to improve maintainability and
 - **Performance improvements:** Significantly faster downloads for large batches (3-5x speedup with 3-5 workers)
 - **Integrated with all modes:** Works with both songlist-across-channels and latest-per-channel download modes
 
----
-
-## 🚀 Future Enhancements
-- [ ] Web UI for easier management
-- [ ] More advanced song matching (multi-language)
-- [ ] Download scheduling and retry logic
-- [ ] More granular status reporting
-- [x] **Parallel downloads for improved speed** ✅ **COMPLETED**
-- [x] **Enhanced fuzzy matching with improved video title parsing** ✅ **COMPLETED**
-- [x] **Consolidated extract_artist_title function** ✅ **COMPLETED**
-- [x] **Duplicate file prevention and filename consistency** ✅ **COMPLETED**
-- [ ] Unit tests for all modules
-- [ ] Integration tests for end-to-end workflows
-- [ ] Plugin system for custom file operations
-- [ ] Advanced configuration UI
-- [ ] Real-time download progress visualization
-
-## 🔧 Recent Bug Fixes & Improvements (v3.4.1)
-### **Enhanced Fuzzy Matching (v3.4.1)**
-- **Improved `extract_artist_title` function**: Enhanced to handle multiple video title formats beyond simple "Artist - Title" patterns
-- **"Title Karaoke | Artist Karaoke Version" format**: Correctly parses titles like "Hold On Loosely Karaoke | 38 Special Karaoke Version"
-- **"Title Artist KARAOKE" format**: Handles titles ending with "KARAOKE" and attempts to extract artist information
-- **Fallback handling**: Returns empty artist and full title for unparseable formats
-- **Consolidated function usage**: Removed duplicate `extract_artist_title` implementations across modules
-- **Single source of truth**: All modules now import from `fuzzy_matcher.py`
-- **Consistent parsing**: Eliminated inconsistencies between different parsing implementations
-- **Better maintainability**: Changes to parsing logic only need to be made in one place
-
-### **Fixed Import Conflicts**
-- **Resolved import conflict in `download_planner.py`**: Updated to use the enhanced `extract_artist_title` from `fuzzy_matcher.py` instead of the simpler version from `id3_utils.py`
-- **Updated `id3_utils.py`**: Now imports `extract_artist_title` from `fuzzy_matcher.py` for consistency
-
-### **Enhanced --limit Parameter**
-- **Fixed limit application**: The `--limit` parameter now correctly applies to the scanning phase, not just the download execution
-- **Improved performance**: When using `--limit N`, only the first N songs are scanned against channels, significantly reducing processing time for large songlists
-
-### **Benefits of Recent Improvements**
-- **Better matching accuracy**: Enhanced fuzzy matching can now handle a wider variety of video title formats commonly found on YouTube karaoke channels
-- **Reduced false negatives**: Songs that previously couldn't be matched due to title format differences now have a higher chance of being found
-- **Consistent behavior**: All parts of the system use the same parsing logic, eliminating edge cases where different modules would parse the same title differently
-- **Improved performance**: The `--limit` parameter now works as expected, providing faster processing for targeted downloads
-- **Cleaner codebase**: Eliminated duplicate code and import conflicts, making the system more maintainable
-
-## 🔧 Recent Bug Fixes & Improvements (v3.4.2)
-### **Duplicate File Prevention & Filename Consistency**
-- **Enhanced file existence checking**: `check_file_exists_with_patterns()` now detects files with `(2)`, `(3)`, etc. suffixes that yt-dlp creates
-- **Automatic duplicate prevention**: Download pipeline skips downloads when files already exist (including duplicates)
-- **Updated yt-dlp configuration**: Set `"nooverwrites": false` to prevent yt-dlp from creating duplicate files with suffixes
-- **Cleanup utility**: `data/cleanup_duplicate_files.py` provides interactive cleanup of existing duplicate files
-- **Filename vs ID3 tag consistency**: Removed "(Karaoke Version)" suffix from ID3 tags to match filenames exactly
-- **Unified parsing**: Both filename generation and ID3 tagging use the same artist/title extraction logic
-
-### **Benefits of Duplicate Prevention**
-- **No more duplicate files**: Eliminates `(2)`, `(3)` suffix files that waste disk space
-- **Consistent metadata**: Filename and ID3 tag use identical artist/title format
-- **Efficient disk usage**: Prevents unnecessary downloads of existing files
-- **Clear file identification**: Consistent naming across all file operations
-
-## 🛠️ Maintenance
-
-### **Regular Cleanup**
-- Run the cleanup utility periodically to remove any duplicate files
-- Monitor downloads for any new duplicate creation (should be rare with fixes)
-
-### **Configuration**
-- Keep `"nooverwrites": false` in `data/config.json`
-- This prevents yt-dlp from creating duplicate files
-
-### **Monitoring**
-- Check logs for "⏭️ Skipping download - file already exists" messages
-- These indicate the duplicate prevention is working correctly
-
-## 🔧 Recent Bug Fixes & Improvements (v3.4.3)
-### **Manual Video Collection System**
-- **New `--manual` parameter**: Simple access to manual video collection via `python download_karaoke.py --manual --limit 5`
-- **Static video management**: `data/manual_videos.json` stores individual karaoke videos that don't belong to regular channels
-- **Helper script**: `add_manual_video.py` provides easy management of manual video entries
-- **Full integration**: Manual videos work with all existing features (songlist matching, fuzzy matching, parallel downloads, etc.)
-- **No yt-dlp dependency**: Manual videos bypass YouTube API calls for video listing, using static data instead
-
-### **Channel-Specific Parsing Rules**
-- **JSON-based configuration**: `data/channels.json` replaces `data/channels.txt` with structured channel configuration
-- **Parsing rules per channel**: Each channel can define custom parsing rules for video titles
-- **Multiple format support**: Handles various title formats like "Artist - Title", "Artist Title", "Title | Artist", etc.
-- **Suffix cleanup**: Automatic removal of common karaoke-related suffixes
-- **Multi-artist support**: Parsing for titles with multiple artists separated by specific delimiters
-- **Backward compatibility**: Still supports legacy `data/channels.txt` format
-
-### **Benefits of New Features**
-- **Flexible video management**: Easy addition of individual karaoke videos without creating new channels
-- **Accurate parsing**: Channel-specific rules ensure correct artist/title extraction for ID3 tags and filenames
-- **Consistent metadata**: Proper parsing prevents filename and ID3 tag inconsistencies
-- **Easy maintenance**: Simple JSON structure for managing both channels and manual videos
-- **Full feature compatibility**: Manual videos work seamlessly with existing download modes and features
-
-## 📚 Documentation Standards
-
-### **Documentation Location**
-- **All changes, refactoring, and improvements should be documented in the PRD.md and README.md files**
-- **Do NOT create separate .md files for documenting changes, refactoring, or improvements**
-- **Use the existing sections in PRD.md and README.md to track all project evolution**
-
-### **Where to Document Changes**
-- **PRD.md**: Technical details, architecture changes, bug fixes, and implementation specifics
-- **README.md**: User-facing features, usage instructions, and high-level improvements
-- **CHANGELOG.md**: Version-specific release notes and change summaries
-
-### **Documentation Requirements**
-- **All new features must be documented in both PRD.md and README.md**
-- **All refactoring efforts must be documented in the appropriate sections**
-- **All bug fixes must be documented with technical details**
-- **Version numbers and dates should be clearly marked**
-- **Benefits and improvements should be explicitly stated**
-
-### **Maintenance Responsibility**
-- **Keep PRD.md and README.md synchronized with code changes**
-- **Update documentation immediately when implementing new features**
-- **Remove outdated information and consolidate related changes**
-- **Ensure all CLI options and features are documented in both files**
-
-## 🔧 Recent Bug Fixes & Improvements (v3.4.4)
-### **All Videos Download Mode**
-- **New `--all-videos` parameter**: Download all videos from a channel, not just songlist matches
-- **Smart MP3/MP4 detection**: Automatically detects if you have MP3 versions in songs.json and downloads MP4 video versions
-- **Existing file skipping**: Skips videos that already exist on the filesystem
-- **Progress tracking**: Shows clear progress with "Downloading X/Y videos" format
-- **Parallel processing support**: Works with `--parallel --workers N` for faster downloads
-- **Channel focus integration**: Works with `--channel-focus` to target specific channels
-- **Limit support**: Works with `--limit N` to control download batch size
-
-### **Smart Songlist Integration**
-- **MP4 version detection**: Checks if MP4 version already exists in songs.json before downloading
-- **MP3 upgrade path**: Downloads MP4 video versions when only MP3 versions exist in songlist
-- **Duplicate prevention**: Skips downloads when MP4 versions already exist
-- **Efficient filtering**: Only processes videos that need to be downloaded
-
-### **Benefits of All Videos Mode**
-- **Complete channel downloads**: Download entire channels without songlist restrictions
-- **Automatic format upgrading**: Upgrade MP3 collections to MP4 video versions
-- **Efficient processing**: Only downloads videos that don't already exist
-- **Flexible control**: Use with limits, parallel processing, and channel targeting
-- **Clear progress feedback**: Real-time progress tracking for large downloads
-
-## 🔧 Recent Bug Fixes & Improvements (v3.4.5)
-### **Unified Download Workflow Architecture**
-- **Unified execution pipeline**: All download modes now use the same execution workflow, eliminating inconsistencies and broken pipelines
-- **Consistent behavior**: All modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel) use identical download execution, progress tracking, and error handling
-- **Centralized download logic**: Single `execute_unified_download_workflow()` method handles all download execution
-- **Automatic parallel support**: All download modes automatically support `--parallel --workers N` without additional implementation
-- **Unified cache management**: Consistent progress tracking and resume functionality across all modes
-
-### **Architecture Pattern for New Download Modes**
-When adding new download modes in the future, follow this pattern to ensure consistency:
-
-#### **1. Download Plan Building (Mode-Specific)**
-Each download mode should build a download plan (list of videos to download) with this structure:
-```python
-download_plan = [
-    {
-        "video_id": "video_id",
-        "artist": "artist_name",
-        "title": "song_title",
-        "filename": "sanitized_filename.mp4",
-        "channel_name": "channel_name",
-        "video_title": "original_video_title",
-        "force_download": False
-    }
-]
-```
-
-#### **2. Unified Execution (Shared)**
-All modes should use the unified execution workflow:
-```python
-downloaded_count, success = self.execute_unified_download_workflow(
-    download_plan=download_plan,
-    cache_file=cache_file,    # Optional, for progress tracking
-    limit=limit,              # Optional, for limiting downloads
-    show_progress=True,       # Optional, for progress display
-)
-```
-
-#### **3. Execution Method Selection (Automatic)**
-The unified workflow automatically chooses execution method based on settings:
-- **Sequential**: Uses `DownloadPipeline` for single-threaded downloads
-- **Parallel**: Uses `ParallelDownloader` when `--parallel` is enabled
-
-#### **4. Required Implementation Pattern**
-```python
-def download_new_mode(self, ...):
-    """New download mode implementation."""
-
-    # 1. Build download plan (mode-specific logic)
-    download_plan = []
-    for video in videos_to_download:
-        download_plan.append({
-            "video_id": video["id"],
-            "artist": artist,
-            "title": title,
-            "filename": filename,
-            "channel_name": channel_name,
-            "video_title": video["title"],
-            "force_download": force_download
-        })
-
-    # 2. Create cache file (optional, for progress tracking)
-    cache_file = get_download_plan_cache_file("new_mode", **plan_kwargs)
-    save_plan_cache(cache_file, download_plan, [])
-
-    # 3. Use unified execution workflow
-    downloaded_count, success = self.execute_unified_download_workflow(
-        download_plan=download_plan,
-        cache_file=cache_file,
-        limit=limit,
-        show_progress=True,
-    )
-
-    return success
-```
-
-### **Benefits of Unified Architecture**
-- **Consistency**: All modes behave identically for execution, progress tracking, and error handling
-- **Maintainability**: Changes to download execution only need to be made in one place
-- **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
-- **Extensibility**: New modes automatically get all existing features (parallel downloads, progress tracking, etc.)
-- **Testing**: Easier to test since all modes use the same execution logic
-
-### **What Was Fixed**
-- **Broken Pipeline**: Previously, different modes used different execution paths, leading to inconsistencies
-- **Missing Method**: Added missing `download_latest_per_channel()` method that was referenced in CLI but not implemented
-- **Code Duplication**: Eliminated duplicate download execution logic across different modes
-- **Inconsistent Behavior**: All modes now have identical progress tracking, error handling, and cache management
-
-### **Future Development Guidelines**
-1. **NEVER implement custom download execution logic** in new download modes
-2. **ALWAYS use `execute_unified_download_workflow()`** for download execution
-3. **Focus on download plan building** - that's where mode-specific logic belongs
-4. **Use the standard download plan structure** for consistency
-5. **Implement cache file handling** for progress tracking and resume functionality
-6. **Test with both sequential and parallel modes** to ensure compatibility
-
+### **Cross-Platform Support (v3.5)**
+- **Platform detection:** Automatic detection of Windows, macOS, and Linux systems
+- **Flexible yt-dlp integration:** Supports both binary files and pip-installed yt-dlp modules
+- **Platform-specific configuration:** Automatic selection of appropriate yt-dlp binary/command for each platform
+- **Setup automation:** `setup_platform.py` script for easy platform-specific setup
+- **Command parsing:** Intelligent parsing of yt-dlp commands (file paths vs. module commands)
+- **Enhanced documentation:** Platform-specific setup instructions and troubleshooting
+- **Backward compatibility:** Maintains full compatibility with existing Windows installations
+- **FFmpeg integration:** Automatic FFmpeg installation and configuration for optimal video processing
+- **Optimized caching:** Enhanced channel video caching with format compatibility and instant video list loading
 
 ---
 
````
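The "flexible yt-dlp integration" added in v3.5 describes choosing between a bundled binary and the pip-installed module. A hedged sketch of that selection logic; the function name and exact fallback order are assumptions, not the project's actual code:

```python
import shutil
import sys
from pathlib import Path
from typing import List, Optional

def build_ytdlp_command(binary_path: Optional[str]) -> List[str]:
    """Prefer a bundled platform binary, then a PATH install, then the module."""
    if binary_path and Path(binary_path).is_file():
        return [binary_path]                 # e.g. downloader/yt-dlp_macos
    if shutil.which("yt-dlp"):
        return ["yt-dlp"]                    # yt-dlp available on PATH
    return [sys.executable, "-m", "yt_dlp"]  # pip-installed module fallback
```

The returned list is ready to pass to `subprocess.run`, with download arguments appended after it.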
@ -534,97 +281,9 @@ def download_new_mode(self, ...):
|
|||||||
- [ ] Download scheduling and retry logic
|
- [ ] Download scheduling and retry logic
|
||||||
- [ ] More granular status reporting
|
- [ ] More granular status reporting
|
||||||
- [x] **Parallel downloads for improved speed** ✅ **COMPLETED**
|
- [x] **Parallel downloads for improved speed** ✅ **COMPLETED**
|
||||||
- [x] **Enhanced fuzzy matching with improved video title parsing** ✅ **COMPLETED**
|
- [x] **Cross-platform support (Windows, macOS, Linux)** ✅ **COMPLETED**
|
||||||
- [x] **Consolidated extract_artist_title function** ✅ **COMPLETED**
|
|
||||||
- [x] **Duplicate file prevention and filename consistency** ✅ **COMPLETED**
|
|
||||||
- [ ] Unit tests for all modules
|
- [ ] Unit tests for all modules
|
||||||
- [ ] Integration tests for end-to-end workflows
|
- [ ] Integration tests for end-to-end workflows
|
||||||
- [ ] Plugin system for custom file operations
|
- [ ] Plugin system for custom file operations
|
||||||
- [ ] Advanced configuration UI
|
- [ ] Advanced configuration UI
|
||||||
- [ ] Real-time download progress visualization
|
- [ ] Real-time download progress visualization
|
||||||
|
|
||||||
## 🔧 Recent Bug Fixes & Improvements (v3.4.4)

### **macOS Support with Automatic Platform Detection**

- **Cross-platform compatibility**: Added support for macOS alongside Windows
- **Automatic platform detection**: Detects the operating system and selects the appropriate yt-dlp binary
- **Flexible yt-dlp integration**: Supports both binary files (`yt-dlp_macos`) and pip installation (`python3 -m yt_dlp`)
- **Setup automation**: `setup_macos.py` script for easy macOS setup with FFmpeg and yt-dlp installation
- **Command parsing**: Intelligent parsing of yt-dlp commands (file paths vs. module commands)
- **Enhanced validation**: Platform-specific error messages and validation in the CLI
- **Backward compatibility**: Maintains full compatibility with existing Windows installations
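The automatic platform detection described above could be sketched roughly as follows; this is a minimal illustration, assuming a `downloader/` directory holding the platform-specific binaries, not the project's actual detection module:

```python
import platform
import shutil
import sys
from pathlib import Path

def resolve_ytdlp_command(downloader_dir="downloader"):
    """Pick a yt-dlp invocation for the current OS.

    Prefers a platform-specific binary under downloader/, then a yt-dlp
    found on PATH, and finally falls back to the pip-installed module.
    """
    binary_names = {
        "Windows": "yt-dlp.exe",
        "Darwin": "yt-dlp_macos",
        "Linux": "yt-dlp",
    }
    name = binary_names.get(platform.system())
    if name:
        candidate = Path(downloader_dir) / name
        if candidate.is_file():
            return [str(candidate)]
    if shutil.which("yt-dlp"):
        return ["yt-dlp"]
    # pip-installed fallback: run yt-dlp as a module
    return [sys.executable, "-m", "yt_dlp"]
```

The returned list can be passed directly to `subprocess.run()`, which is why the function yields argument lists rather than shell strings.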
### **Benefits of macOS Support**

- **Native macOS experience**: No need for Windows compatibility layers or virtualization
- **Automatic setup**: A simple setup script handles all dependencies
- **Flexible installation**: Choose between binary download or pip installation
- **Consistent functionality**: All features work identically on both platforms
- **Easy maintenance**: Platform detection handles configuration automatically
### **Setup Instructions**

```bash
# Automatic setup (recommended)
python3 setup_macos.py

# Test installation
python3 src/tests/test_macos.py

# Manual setup options
# 1. Install yt-dlp via pip: pip3 install yt-dlp
# 2. Download binary: curl -L -o downloader/yt-dlp_macos https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos
# 3. Install FFmpeg: brew install ffmpeg
```
## 🔧 Recent Bug Fixes & Improvements (v3.4.7)

### **Configurable Data Directory Path**

- **Centralized Data Path Management**: New `data_path_manager.py` module provides unified data directory path management
- **Configurable Location**: Data directory path can be set in `config/config.json` under `folder_structure.data_dir`
- **Backward Compatibility**: Defaults to the "data" directory if not configured
- **Cross-Project Integration**: Enables the karaoke downloader to be used as a component in other projects with different data directory structures
- **Updated All Modules**: All modules now use the data path manager instead of hardcoded "data/" paths
- **Utility Functions**: Provides `get_data_path()`, `get_data_dir()`, and `get_data_path_manager()` functions for easy access
- **Fixed Circular Dependency**: Moved `config.json` from `data/` to the `config/` directory to resolve the chicken-and-egg problem
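In outline, a data path manager like the one described above might look like this; a sketch under the stated config layout, not the actual `data_path_manager.py`:

```python
import json
from pathlib import Path

class DataPathManager:
    """Resolve file paths under a configurable data directory.

    Reads folder_structure.data_dir from the config file and falls
    back to "data" when the key (or the file itself) is missing.
    """

    def __init__(self, config_file="config/config.json"):
        self.data_dir = Path("data")  # backward-compatible default
        try:
            with open(config_file, encoding="utf-8") as f:
                config = json.load(f)
            configured = config.get("folder_structure", {}).get("data_dir")
            if configured:
                self.data_dir = Path(configured)
        except (OSError, json.JSONDecodeError):
            pass  # keep the default when the config is absent or malformed

    def get_data_path(self, filename):
        """Return the full path of a file inside the data directory."""
        return self.data_dir / filename
```

With no config present, `DataPathManager().get_data_path("songList.json")` resolves to `data/songList.json`, which preserves the old hardcoded behavior.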
### **Benefits of Configurable Data Directory**

- **Flexible Deployment**: Can be integrated into other projects with different directory structures
- **Centralized Configuration**: A single point of configuration for all data file paths
- **Maintainable Code**: Eliminates hardcoded paths throughout the codebase
- **Easy Testing**: Can use temporary directories for testing without affecting production data
- **Future-Proof**: Makes it easier to change the data directory structure in the future
### **Circular Dependency Solution**

The original implementation had a circular dependency problem:

- **Problem**: `config.json` was located in the `data/` directory
- **Issue**: To read the config file, we needed to know where the data directory was
- **Conflict**: But the data directory location was specified in the config file
- **Solution**: Moved `config.json` to the `config/` directory as a fixed location
- **Result**: The config file is always accessible in a dedicated config directory, and the data directory can be configured within it
- **Backward Compatibility**: The system still works with config files in custom data directories when explicitly specified
## 🔧 Recent Bug Fixes & Improvements (v3.4.6)

### **Dry Run Mode**

- **New `--dry-run` parameter**: Builds the download plan and shows what would be downloaded without actually downloading anything
- **Plan preview**: Shows the total number of videos in the plan and a preview of the first 5 videos
- **Safe testing**: Test download configurations without consuming bandwidth or disk space
- **All-mode support**: Works with all download modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel)
- **Progress simulation**: Shows what the download process would look like without executing it
### **Benefits of Dry Run Mode**

- **Safe testing**: Test complex download configurations without downloading anything
- **Plan validation**: Verify that the download plan contains the expected videos
- **Configuration debugging**: Troubleshoot download settings before committing to downloads
- **Resource conservation**: Save bandwidth and disk space during testing
- **User education**: Help users understand what the tool will do before running it
### **Example Usage**

```bash
# Test songlist download plan
python download_karaoke.py --songlist-only --limit 5 --dry-run

# Test channel download plan
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 10 --dry-run

# Test with fuzzy matching
python download_karaoke.py --songlist-only --fuzzy-match --limit 3 --dry-run
```
374 README.md

@@ -1,6 +1,6 @@
# 🎤 Karaoke Video Downloader

A Python-based cross-platform CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp`, with advanced tracking, songlist prioritization, and flexible configuration. Supports Windows, macOS, and Linux with automatic platform detection, optimized caching, and FFmpeg integration.
## ✨ Features

- 🎵 **Channel & Playlist Downloads**: Download all videos from a YouTube channel or playlist

@@ -13,7 +13,7 @@ A Python-based cross-platform CLI tool to download karaoke videos from YouTube c

- 📈 **Real-Time Progress**: Detailed console and log output
- 🧹 **Reset/Clear Channel**: Reset all tracking and files for a channel, or clear the channel cache via CLI
- 🗂️ **Latest-per-channel download**: Download the latest N videos from each channel in a single batch, with server deduplication, fuzzy matching support, per-channel download plan, robust resume, and unique plan cache. Use --latest-per-channel and --limit N.
- 🧩 **Fuzzy Matching**: Optionally use fuzzy string matching for songlist-to-video matching (with --fuzzy-match; requires rapidfuzz for best results)
- ⚡ **Fast Mode with Early Exit**: When a limit is set, scans channels and songs in order, downloads immediately when a match is found, and stops as soon as the limit is reached with successful downloads
- 🔄 **Deduplication Across Channels**: Ensures the same song is not downloaded from multiple channels, even if it appears in more than one channel's video list
- 📋 **Default Channel File**: Automatically uses data/channels.txt as the default channel list for songlist modes (no need to specify --file every time)

@@ -21,20 +21,13 @@ A Python-based cross-platform CLI tool to download karaoke videos from YouTube c

- ⚡ **Optimized Scanning**: High-performance channel scanning with O(n×m) complexity, pre-processed lookups, and early termination for faster matching
- 🏷️ **Server Duplicates Tracking**: Automatically checks against the local songs.json file and marks duplicates for future skipping, preventing re-downloads of songs already on the server
- ⚡ **Parallel Downloads**: Enable concurrent downloads with `--parallel --workers N` for significantly faster batch downloads (3-5x speedup)
- 🌐 **Cross-Platform Support**: Automatic platform detection and yt-dlp integration for Windows, macOS, and Linux
- 🚀 **Optimized Caching**: Enhanced channel video caching with instant video list loading
- 🎬 **FFmpeg Integration**: Automatic FFmpeg installation and configuration for optimal video processing
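The Fast Mode early-exit scan listed above can be sketched as a pair of nested loops that stop as soon as the limit is hit; `find_match` and `download` here are hypothetical stand-ins for the project's real matching and download functions:

```python
def fast_mode_scan(channels, songs, limit, find_match, download):
    """Scan channels and songs in order, download on match, stop at the limit.

    find_match(channel, song) returns a video or None; download(video)
    returns True on success. Both are injected for illustration.
    """
    downloaded = 0
    for channel in channels:
        for song in songs:
            if downloaded >= limit:
                return downloaded  # early exit: limit reached
            video = find_match(channel, song)
            if video is not None and download(video):
                downloaded += 1
    return downloaded
```

Because the check happens before each match attempt, no further channels or songs are scanned once the limit is satisfied.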
## 🏗️ Architecture
The codebase has been comprehensively refactored into a modular architecture with centralized utilities for improved maintainability, error handling, and code reuse:

### **Configurable Data Directory (v3.4.7)**
- **Centralized Data Path Management**: `data_path_manager.py` provides unified data directory path management
- **Configurable Location**: Data directory path can be set in `config/config.json` under `folder_structure.data_dir`
- **Backward Compatibility**: Defaults to the "data" directory if not configured
- **Cross-Project Integration**: Enables the karaoke downloader to be used as a component in other projects with different data directory structures

### Core Modules:
- **`downloader.py`**: Main orchestrator and CLI interface
- **`video_downloader.py`**: Core video download execution and orchestration

@@ -56,191 +49,90 @@ The codebase has been comprehensively refactored into a modular architecture wit

- **`tracking_cli.py`**: Tracking management CLI
### New Utility Modules (v3.3):
- **`parallel_downloader.py`**: Parallel download management with thread-safe operations
  - `ParallelDownloader` class: Manages concurrent downloads with configurable workers
  - `DownloadTask` and `DownloadResult` dataclasses: Structured task and result management
  - Thread-safe progress tracking and error handling
  - Automatic retry mechanism for failed downloads
- **`file_utils.py`**: Centralized file operations, filename sanitization, and file validation
  - `sanitize_filename()`: Create safe filenames from artist/title
  - `generate_possible_filenames()`: Generate filename patterns for different modes
  - `check_file_exists_with_patterns()`: Check for existing files using multiple patterns
  - `is_valid_mp4_file()`: Validate MP4 files with header checking
  - `cleanup_temp_files()`: Remove temporary yt-dlp files
  - `ensure_directory_exists()`: Safe directory creation
- **`song_validator.py`**: Centralized song validation logic
  - `SongValidator` class: Unified logic for checking if songs should be downloaded
  - `should_skip_song()`: Comprehensive validation with multiple criteria
  - `mark_song_failed()`: Consistent failure tracking
  - `handle_download_failure()`: Standardized error handling
- **Enhanced `config_manager.py`**: Robust configuration management with dataclasses
  - `ConfigManager` class: Type-safe configuration loading and caching
  - `DownloadSettings`, `FolderStructure`, `LoggingConfig` dataclasses
  - Configuration validation and merging with defaults
  - Dynamic resolution updates
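As an illustration of the `file_utils.py` helpers listed above, a `sanitize_filename()` might look roughly like this; a sketch of the idea, not the module's actual implementation:

```python
import re

def sanitize_filename(artist, title, max_length=150):
    """Build a filesystem-safe "Artist - Title" filename stem.

    Strips characters that are invalid on Windows/macOS/Linux paths
    and collapses runs of whitespace left behind by the stripping.
    """
    stem = f"{artist} - {title}"
    stem = re.sub(r'[<>:"/\\|?*]', "", stem)   # drop invalid path characters
    stem = re.sub(r"\s+", " ", stem).strip()   # collapse whitespace
    return stem[:max_length]
```

Keeping filename generation in one helper is what lets the filename and the ID3 tag stay in the identical artist/title format mentioned elsewhere in this document.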
### Benefits:
- **Centralized Utilities**: Common operations (file operations, song validation, yt-dlp commands, error handling) are centralized
- **Reduced Duplication**: Eliminated ~150 lines of code duplication across modules
- **Consistency**: Standardized error messages and processing pipelines
- **Maintainability**: Changes isolated to specific modules
- **Testability**: Modular components can be tested independently
- **Type Safety**: Comprehensive type hints across all new modules
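The dataclass-based configuration described above might be structured roughly as below; the field names are invented for illustration, and the real `config_manager.py` defines its own:

```python
import json
from dataclasses import dataclass, field

@dataclass
class FolderStructure:
    data_dir: str = "data"
    downloads_dir: str = "downloads"

@dataclass
class DownloadSettings:
    resolution: str = "1080"
    parallel_workers: int = 1

@dataclass
class ConfigManager:
    """Load config.json once and expose type-safe sections."""
    config_file: str = "config/config.json"
    folders: FolderStructure = field(default_factory=FolderStructure)
    downloads: DownloadSettings = field(default_factory=DownloadSettings)

    def load(self):
        try:
            with open(self.config_file, encoding="utf-8") as f:
                raw = json.load(f)
        except (OSError, json.JSONDecodeError):
            return self  # merging with defaults: keep them on failure
        self.folders = FolderStructure(**raw.get("folder_structure", {}))
        self.downloads = DownloadSettings(**raw.get("download_settings", {}))
        return self
```

The dataclass defaults double as the "merge with defaults" behavior: any section missing from the JSON simply keeps its typed default values.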
## 🔧 Development Guidelines

### **Adding New Download Modes**
When adding new download modes, follow the unified workflow pattern to ensure consistency:

#### **1. Build Download Plan (Mode-Specific)**
```python
def download_new_mode(self, ...):
    # Build download plan with standard structure
    download_plan = []
    for video in videos_to_download:
        download_plan.append({
            "video_id": video["id"],
            "artist": artist,
            "title": title,
            "filename": filename,
            "channel_name": channel_name,
            "video_title": video["title"],
            "force_download": force_download
        })

    # Use unified execution workflow
    downloaded_count, success = self.execute_unified_download_workflow(
        download_plan=download_plan,
        cache_file=cache_file,
        limit=limit,
        show_progress=True,
    )

    return success
```

#### **2. Key Principles**
- **NEVER implement custom download execution logic** - always use `execute_unified_download_workflow()`
- **Focus on download plan building** - that's where mode-specific logic belongs
- **Use the standard download plan structure** for consistency
- **Implement cache file handling** for progress tracking and resume functionality
- **Test with both sequential and parallel modes** to ensure compatibility

#### **3. Benefits of Unified Architecture**
- **Consistency**: All modes behave identically for execution, progress tracking, and error handling
- **Automatic Features**: New modes automatically get parallel downloads, progress tracking, and cache management
- **Maintainability**: Changes to download execution only need to be made in one place
- **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
## 🔧 Recent Improvements (v3.4.1)

### **Enhanced Fuzzy Matching**
- **Improved title parsing**: Enhanced the `extract_artist_title` function to handle multiple video title formats
- **Better matching accuracy**: Can now parse titles like "Hold On Loosely Karaoke | 38 Special Karaoke Version"
- **Consistent parsing**: All modules now use the same parsing logic from `fuzzy_matcher.py`
- **Reduced false negatives**: Songs that previously couldn't be matched due to title format differences now have a higher chance of being found
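The improved parsing can be illustrated with a simplified sketch; the real `extract_artist_title` in `fuzzy_matcher.py` handles more formats than the two shown here:

```python
import re

def extract_artist_title(video_title):
    """Best-effort split of a karaoke video title into (artist, title).

    Handles two common formats:
      "Artist - Title (Karaoke Version)"
      "Title Karaoke | Artist Karaoke Version"
    Returns (None, cleaned_title) when no artist can be recovered.
    """
    # Format: "Title Karaoke | Artist Karaoke Version"
    m = re.match(
        r"^(?P<title>.+?)\s+Karaoke\s*\|\s*(?P<artist>.+?)\s+Karaoke\s+Version\s*$",
        video_title,
        flags=re.IGNORECASE,
    )
    if m:
        return m.group("artist").strip(), m.group("title").strip()
    # Format: "Artist - Title (...)" with a karaoke suffix to strip
    if " - " in video_title:
        artist, title = video_title.split(" - ", 1)
        title = re.sub(r"\s*\(.*?karaoke.*?\)\s*", " ",
                       title, flags=re.IGNORECASE).strip()
        return artist.strip(), title
    return None, video_title.strip()
```

Centralizing this logic in one function is what keeps filename generation, ID3 tagging, and fuzzy matching in agreement.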
### **Fixed Import Conflicts**
- **Resolved import conflicts**: Updated modules to use the enhanced `extract_artist_title` from `fuzzy_matcher.py`
- **Consistent behavior**: All parts of the system use the same parsing logic
- **Cleaner codebase**: Eliminated duplicate code and import conflicts

### **Fixed --limit Parameter**
- **Correct limit application**: The `--limit` parameter now properly limits the scanning phase, not just downloads
- **Improved performance**: When using `--limit N`, only the first N songs are scanned, significantly reducing processing time
- **Accurate logging**: Logging messages now show the correct counts for songs that will actually be processed when using `--limit`

### **Code Quality Improvements**
- **Eliminated duplicate functions**: Removed duplicate `extract_artist_title` implementations
- **Fixed import conflicts**: Resolved inconsistencies between different parsing implementations
- **Single source of truth**: All title parsing logic is now centralized in `fuzzy_matcher.py`
## 🔧 Recent Improvements (v3.4.5)

### **Unified Download Workflow Architecture**
- **Unified execution pipeline**: All download modes now use the same execution workflow, eliminating inconsistencies and broken pipelines
- **Consistent behavior**: All modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel) use identical download execution, progress tracking, and error handling
- **Centralized download logic**: A single `execute_unified_download_workflow()` method handles all download execution
- **Automatic parallel support**: All download modes automatically support `--parallel --workers N` without additional implementation
- **Unified cache management**: Consistent progress tracking and resume functionality across all modes

### **What Was Fixed**
- **Broken Pipeline**: Previously, different modes used different execution paths, leading to inconsistencies
- **Missing Method**: Added the missing `download_latest_per_channel()` method that was referenced in the CLI but not implemented
- **Code Duplication**: Eliminated duplicate download execution logic across different modes
- **Inconsistent Behavior**: All modes now have identical progress tracking, error handling, and cache management

### **Benefits**
- ✅ **Consistency**: All modes behave identically for execution, progress tracking, and error handling
- ✅ **Maintainability**: Changes to download execution only need to be made in one place
- ✅ **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
- ✅ **Extensibility**: New modes automatically get all existing features (parallel downloads, progress tracking, etc.)
- ✅ **Testing**: Easier to test since all modes use the same execution logic
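In outline, the unified workflow runs one download plan through a single pipeline and switches only the execution strategy; this is a simplified sketch with a stand-in `download_one` callable, whereas the real method also manages cache files and limits:

```python
from concurrent.futures import ThreadPoolExecutor

def execute_download_plan(download_plan, download_one, parallel=False, workers=4):
    """Run every entry of a download plan through the same pipeline.

    download_one(entry) -> bool stands in for the per-video download
    step; sequential and parallel modes differ only in how it is invoked.
    """
    if not parallel:
        results = [download_one(entry) for entry in download_plan]
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(download_one, download_plan))
    downloaded = sum(1 for ok in results if ok)
    return downloaded, downloaded == len(download_plan)
```

Because the plan structure is identical for every mode, any new mode that builds a valid plan gets both execution strategies without extra code.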
## 🛡️ Duplicate File Prevention & Filename Consistency (v3.4.2)

### **Duplicate File Prevention**
- **Enhanced file existence checking**: Now detects files with `(2)`, `(3)`, etc. suffixes that yt-dlp creates
- **Automatic duplicate prevention**: Skips downloads when files already exist (including duplicates)
- **Updated yt-dlp configuration**: Set `"nooverwrites": false` to prevent yt-dlp from creating duplicate files
- **Cleanup utility**: `data/cleanup_duplicate_files.py` helps identify and remove existing duplicate files

### **Filename vs ID3 Tag Consistency**
- **Consistent metadata**: Filename and ID3 tag now use an identical artist/title format
- **Removed extra suffixes**: No more "(Karaoke Version)" in ID3 tags that don't match filenames
- **Unified parsing**: Both filename generation and ID3 tagging use the same artist/title extraction

### **Benefits**
- ✅ **No more duplicate files** with `(2)`, `(3)` suffixes
- ✅ **Consistent metadata** between filename and ID3 tags
- ✅ **Efficient disk usage** by preventing unnecessary downloads
- ✅ **Clear file identification** with consistent naming
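Detecting the `(2)`, `(3)` style duplicates described above can be sketched like this; an illustration of the idea, not the cleanup utility itself:

```python
import re
from pathlib import Path

def find_duplicate_files(directory):
    """Group files like "Song (2).mp4" with their base "Song.mp4".

    Returns the duplicate paths whose base file also exists, which are
    safe candidates for deletion.
    """
    suffix_re = re.compile(r"^(?P<base>.+?) \((?P<n>\d+)\)$")
    duplicates = []
    for path in Path(directory).glob("*.mp4"):
        m = suffix_re.match(path.stem)
        if m and path.with_name(m.group("base") + path.suffix).exists():
            duplicates.append(path)
    return duplicates
```

Only requiring that the base file exists keeps the check conservative: a `(2)` file whose original is missing is left alone rather than deleted.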
### **Clean Up Existing Duplicates**
```bash
# Run the cleanup utility to find and remove existing duplicates
python data/cleanup_duplicate_files.py

# Choose option 1 for dry run (recommended first)
# Choose option 2 to actually delete duplicates
```
## 📋 Requirements
- **Windows 10/11, macOS 10.14+, or Linux**
- **Python 3.7+**
- **yt-dlp binary** (platform-specific, see setup instructions below)
- **mutagen** (for ID3 tagging, optional)
- **ffmpeg/ffprobe** (for video validation, optional but recommended)
- **rapidfuzz** (for fuzzy matching, optional, falls back to difflib)
## 🖥️ Platform Setup

### Automatic Setup (Recommended)
Run the platform setup script to automatically set up yt-dlp for your system:

```bash
python setup_platform.py
```

This script will:
- Detect your platform (Windows, macOS, or Linux)
- Offer two installation options:
  1. **Download binary file** (recommended for most users)
  2. **Install via pip** (alternative method)
- Make binaries executable (on Unix-like systems)
- Install FFmpeg (for optimal video processing)
- Test the installation

### Manual Setup
If you prefer to set up manually:

#### Option 1: Download Binary Files
1. **Windows**: Download `yt-dlp.exe` from [yt-dlp releases](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.exe)
2. **macOS**: Download `yt-dlp_macos` from [yt-dlp releases](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos)
3. **Linux**: Download `yt-dlp` from [yt-dlp releases](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp)

Place the downloaded file in the `downloader/` directory and make it executable on Unix-like systems:
```bash
chmod +x downloader/yt-dlp_macos  # macOS
chmod +x downloader/yt-dlp        # Linux
```

#### Option 2: Install via pip
```bash
pip install yt-dlp
```

The tool will automatically detect and use the pip-installed version on macOS.

**Note**: FFmpeg is also required for optimal video processing. The setup script will attempt to install it automatically, or you can install it manually:
- **macOS**: `brew install ffmpeg`
- **Linux**: `sudo apt install ffmpeg` (Ubuntu/Debian) or `sudo yum install ffmpeg` (CentOS/RHEL)
- **Windows**: Download from [ffmpeg.org](https://ffmpeg.org/download.html)
## 🚀 Quick Start

@@ -251,21 +143,6 @@ python3 src/tests/test_macos.py

```bash
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
```

### Download ALL Videos from a Channel (Not Just Songlist Matches)
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
```

### Download ALL Videos with Parallel Processing
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
```

### Download ALL Videos with Limit
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100
```

### Download Only Songlist Songs (Fast Mode)
```bash
python download_karaoke.py --songlist-only --limit 5
```

@@ -273,7 +150,7 @@ python download_karaoke.py --songlist-only --limit 5

### Download with Parallel Processing
```bash
python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10
```

### Focus on Specific Playlists by Title

@@ -281,31 +158,11 @@ python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10

```bash
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"
```

### Focus on Specific Playlists from Custom File
```bash
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json"
```

### Force Download from Channels (Bypass All Existing File Checks)
```bash
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force
```

### Download with Fuzzy Matching
```bash
python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85
```

### Test Download Plan (Dry Run)
```bash
python download_karaoke.py --songlist-only --limit 5 --dry-run
```

### Test Channel Download Plan (Dry Run)
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 10 --dry-run
```

### Download Latest N Videos Per Channel
```bash
python download_karaoke.py --latest-per-channel --limit 5
```
@@ -410,33 +267,23 @@ KaroakeVideoDownloader/
 │   ├── check_resolution.py              # Resolution checker utility
 │   ├── resolution_cli.py                # Resolution config CLI
 │   └── tracking_cli.py                  # Tracking management CLI
-├── config/                              # Configuration files
-│   └── config.json                      # Main configuration file
-├── data/                                # All tracking, cache, and songlist files
+├── data/                                # All config, tracking, cache, and songlist files
+│   ├── config.json
 │   ├── karaoke_tracking.json
 │   ├── songlist_tracking.json
 │   ├── channel_cache.json
-│   ├── channels.json                    # Channel configuration with parsing rules
+│   ├── channels.txt
 │   └── songList.json
-├── utilities/                           # Utility scripts and tools
-│   ├── add_manual_video.py              # Manual video management
-│   ├── build_cache_from_raw.py          # Cache building utility
-│   ├── cleanup_duplicate_files.py       # File cleanup utilities
-│   ├── cleanup_recent_tracking.py       # Tracking cleanup utilities
-│   ├── deduplicate_songlist_tracking.py # Data deduplication
-│   ├── fix_artist_name_format.py        # Data cleanup utilities
-│   ├── fix_artist_name_format_simple.py
-│   ├── fix_code_quality.py              # Development tools
-│   ├── reset_and_redownload.py          # Maintenance utilities
-│   └── songlist_report.py               # Reporting utilities
 ├── downloads/                           # All video output
 │   └── [ChannelName]/                   # Per-channel folders
 ├── logs/                                # Download logs
 ├── downloader/yt-dlp.exe                # yt-dlp binary (Windows)
 ├── downloader/yt-dlp_macos              # yt-dlp binary (macOS)
-├── src/tests/                           # Test scripts
-│   ├── test_macos.py                    # macOS setup and functionality tests
-│   └── test_platform.py                 # Platform detection tests
+├── downloader/yt-dlp                    # yt-dlp binary (Linux)
+├── setup_platform.py                    # Platform setup script
+├── test_platform.py                     # Platform test script
+├── tests/                               # Diagnostic and test scripts
+│   └── test_installation.py
 ├── download_karaoke.py                  # Main entry point (thin wrapper)
 ├── README.md
 ├── PRD.md
@@ -453,7 +300,6 @@ KaroakeVideoDownloader/
 - `--songlist-priority`: Prioritize songlist songs in download queue
 - `--songlist-only`: Download only songs from the songlist
 - `--songlist-focus <PLAYLIST_TITLE1> <PLAYLIST_TITLE2>...`: Focus on specific playlists by title (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`)
-- `--songlist-file <FILE_PATH>`: Custom songlist file path to use with --songlist-focus (default: data/songList.json)
 - `--songlist-status`: Show songlist download progress
 - `--limit <N>`: Limit number of downloads (enables fast mode with early exit)
 - `--resolution <720p|1080p|...>`: Override resolution
@@ -465,14 +311,8 @@ KaroakeVideoDownloader/
 - `--latest-per-channel`: **Download the latest N videos from each channel (use with --limit)**
 - `--fuzzy-match`: Enable fuzzy matching for songlist-to-video matching (uses rapidfuzz if available)
 - `--fuzzy-threshold <N>`: Fuzzy match threshold (0-100, default 85)
-- `--parallel`: Enable parallel downloads for improved speed (defaults to 3 workers)
-- `--workers <N>`: Number of parallel download workers (1-10, default: 3, only used with --parallel)
+- `--parallel`: Enable parallel downloads for improved speed
+- `--workers <N>`: Number of parallel download workers (1-10, default: 3)
-- `--generate-songlist <DIR1> <DIR2>...`: **Generate song list from MP4 files with ID3 tags in specified directories**
-- `--no-append-songlist`: **Create a new song list instead of appending when using --generate-songlist**
-- `--force`: **Force download from channels, bypassing all existing file checks and re-downloading if necessary**
-- `--channel-focus <CHANNEL_NAME>`: **Download from a specific channel by name (e.g., 'SingKingKaraoke')**
-- `--all-videos`: **Download all videos from channel (not just songlist matches), skipping existing files**
-- `--dry-run`: **Build download plan and show what would be downloaded without actually downloading anything**
 
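The `--fuzzy-match`/`--fuzzy-threshold` flags above compare songlist entries against video titles on a 0-100 scale. A minimal sketch of that scoring, assuming the documented "rapidfuzz if available" behavior and using a stdlib `difflib` stand-in otherwise (the function names are illustrative, not the project's actual code):

```python
import difflib

try:
    # Preferred scorer when rapidfuzz is installed, as the flag description notes.
    from rapidfuzz import fuzz

    def similarity(a, b):
        return fuzz.ratio(a.lower(), b.lower())
except ImportError:
    # Stdlib stand-in: SequenceMatcher's ratio scaled to the same 0-100 range.
    def similarity(a, b):
        return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100


def is_songlist_match(song, video_title, threshold=85):
    """Accept a video when its title scores at or above --fuzzy-threshold."""
    return similarity(song, video_title) >= threshold
```

Lowering the threshold (e.g., `--fuzzy-threshold 80`) accepts looser matches such as titles with "(Karaoke Version)" suffixes; raising it demands near-exact titles.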
 ## 📝 Example Usage
 
@@ -483,61 +323,30 @@ KaroakeVideoDownloader/
 python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85
 
 # Parallel downloads for faster processing
-python download_karaoke.py --parallel --songlist-only --limit 10
+python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10
 
 # Latest videos per channel with parallel downloads
-python download_karaoke.py --parallel --latest-per-channel --limit 5
+python download_karaoke.py --parallel --workers 3 --latest-per-channel --limit 5
 
 # Traditional full scan (no limit)
 python download_karaoke.py --songlist-only
 
-# Focused fuzzy matching (target specific playlists with flexible matching)
-python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --fuzzy-match --fuzzy-threshold 80 --limit 10
-
-# Focus on specific playlists from a custom file
-python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json" --limit 10
-
-# Force download with fuzzy matching (bypass all existing file checks)
-python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force --fuzzy-match --fuzzy-threshold 80 --limit 10
-
 # Channel-specific operations
 python download_karaoke.py --reset-channel SingKingKaraoke
 python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
 python download_karaoke.py --clear-cache all
 python download_karaoke.py --clear-server-duplicates
 
-# Download ALL videos from a specific channel
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100
-
-# Song list generation from MP4 files
-python download_karaoke.py --generate-songlist /path/to/mp4/directory
-python download_karaoke.py --generate-songlist /path/to/dir1 /path/to/dir2 --no-append-songlist
-
-# Generate report of songs that couldn't be found
-python download_karaoke.py --generate-unmatched-report
-python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-threshold 85
 ```
 
 ## 🏷️ ID3 Tagging
 - Adds artist/title/album/genre to MP4 files using mutagen (if installed)
 
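The tag fields above map to iTunes-style MP4 atoms, which is how mutagen's `MP4` class addresses them. A small sketch of that mapping, assuming illustrative helper names and placeholder album/genre defaults (the project's actual defaults may differ):

```python
# iTunes-style MP4 tag atoms as read/written by mutagen's MP4 class:
# "\xa9ART" = artist, "\xa9nam" = title, "\xa9alb" = album, "\xa9gen" = genre.
def build_mp4_tags(artist, title, album="Karaoke", genre="Karaoke"):
    """Illustrative helper: tag dict for one karaoke MP4 (values are lists, per the MP4 format)."""
    return {
        "\xa9ART": [artist],
        "\xa9nam": [title],
        "\xa9alb": [album],
        "\xa9gen": [genre],
    }

# With mutagen installed, the tags would be applied roughly like this (assumed call site):
#   from mutagen.mp4 import MP4
#   f = MP4("downloads/SingKingKaraoke/some_song.mp4")
#   f.update(build_mp4_tags("Adele", "Hello"))
#   f.save()
```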
-## 📋 Song List Generation
-- **Generate song lists from existing MP4 files**: Use `--generate-songlist` to create song lists from directories containing MP4 files with ID3 tags
-- **Automatic ID3 extraction**: Extracts artist and title from MP4 files' ID3 tags
-- **Directory-based organization**: Each directory becomes a playlist with the directory name as the title
-- **Position tracking**: Songs are numbered starting from 1 based on file order
-- **Append or replace**: Choose to append to existing song list or create a new one with `--no-append-songlist`
-- **Multiple directories**: Process multiple directories in a single command
-
 ## 🧹 Cleanup
 - Removes `.info.json` and `.meta` files after download
 
 ## 🛠️ Configuration
-- All options are in `config/config.json` (format, resolution, metadata, etc.)
+- All options are in `data/config.json` (format, resolution, metadata, etc.)
 - You can edit this file or use CLI flags to override
-- **Configurable Data Directory**: The data directory path can be configured in `config/config.json` under `folder_structure.data_dir` (default: "data")
 
 ## 📋 Command Reference File
 
@@ -553,32 +362,7 @@ python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-thr
 
 > **🔄 Maintenance Note**: The `commands.txt` file should be kept up to date with any CLI changes. When adding new command-line options or modifying existing ones, update this file to reflect all available commands and their usage.
 
-## 📚 Documentation Standards
-
-### **Documentation Location**
-- **All changes, refactoring, and improvements should be documented in the PRD.md and README.md files**
-- **Do NOT create separate .md files for documenting changes, refactoring, or improvements**
-- **Use the existing sections in PRD.md and README.md to track all project evolution**
-
-### **Where to Document Changes**
-- **PRD.md**: Technical details, architecture changes, bug fixes, and implementation specifics
-- **README.md**: User-facing features, usage instructions, and high-level improvements
-- **CHANGELOG.md**: Version-specific release notes and change summaries
-
-### **Documentation Requirements**
-- **All new features must be documented in both PRD.md and README.md**
-- **All refactoring efforts must be documented in the appropriate sections**
-- **All bug fixes must be documented with technical details**
-- **Version numbers and dates should be clearly marked**
-- **Benefits and improvements should be explicitly stated**
-
-### **Maintenance Responsibility**
-- **Keep PRD.md and README.md synchronized with code changes**
-- **Update documentation immediately when implementing new features**
-- **Remove outdated information and consolidate related changes**
-- **Ensure all CLI options and features are documented in both files**
-
-## 🔧 Refactoring Improvements (v3.3)
+## 🔧 Refactoring Improvements (v3.5)
 The codebase has been comprehensively refactored to improve maintainability and reduce code duplication. Recent improvements have enhanced reliability, performance, and code organization:
 
 ### **New Utility Modules (v3.3)**
@@ -613,9 +397,20 @@ The codebase has been comprehensively refactored to improve maintainability and
 - **Improved Testability**: Modular components can be tested independently
 - **Better Developer Experience**: Clear function signatures and comprehensive documentation
 
+### **Cross-Platform Support (v3.5)**
+- **Platform detection:** Automatic detection of Windows, macOS, and Linux systems
+- **Flexible yt-dlp integration:** Supports both binary files and pip-installed yt-dlp modules
+- **Platform-specific configuration:** Automatic selection of appropriate yt-dlp binary/command for each platform
+- **Setup automation:** `setup_platform.py` script for easy platform-specific setup
+- **Command parsing:** Intelligent parsing of yt-dlp commands (file paths vs. module commands)
+- **Enhanced documentation:** Platform-specific setup instructions and troubleshooting
+- **Backward compatibility:** Maintains full compatibility with existing Windows installations
+- **FFmpeg integration:** Automatic FFmpeg installation and configuration for optimal video processing
+- **Optimized caching:** Enhanced channel video caching with format compatibility and instant video list loading
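The platform-detection and binary-vs-module selection described above can be sketched roughly as follows; the binary names match the `downloader/` layout in the project structure, but the function and dictionary names are illustrative, not the project's actual code:

```python
import platform
import sys
from pathlib import Path

# Bundled yt-dlp binary names per platform, matching the downloader/ folder layout.
YTDLP_BINARIES = {
    "Windows": "yt-dlp.exe",
    "Darwin": "yt-dlp_macos",  # platform.system() reports macOS as "Darwin"
    "Linux": "yt-dlp",
}


def ytdlp_command(downloader_dir="downloader"):
    """Prefer the platform's bundled binary; fall back to a pip-installed yt-dlp module."""
    name = YTDLP_BINARIES.get(platform.system())
    if name is not None:
        binary = Path(downloader_dir) / name
        if binary.exists():
            return [str(binary)]                 # file-path command
    return [sys.executable, "-m", "yt_dlp"]      # module command (pip install yt-dlp)
```

The returned list can be passed straight to `subprocess.run(...)` with the usual yt-dlp arguments appended, which is one way the "file paths vs. module commands" parsing point could work in practice.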
 
 ### **New Parallel Download System (v3.4)**
 - **Parallel downloader module:** `parallel_downloader.py` provides thread-safe concurrent download management
-- **Configurable concurrency:** Use `--parallel` to enable parallel downloads with 3 workers by default, or `--parallel --workers N` for custom worker count (1-10)
+- **Configurable concurrency:** Use `--parallel --workers N` to enable parallel downloads with N workers (1-10)
 - **Thread-safe operations:** All tracking, caching, and progress operations are thread-safe
 - **Real-time progress tracking:** Shows active downloads, completion status, and overall progress
 - **Automatic retry mechanism:** Failed downloads are automatically retried with reduced concurrency
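The bounded, thread-safe worker pool described above might look like this in outline — a sketch built on `concurrent.futures`, not the actual `parallel_downloader.py`; only the 1-10 worker clamp mirrors the documented `--workers` range:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading


def download_all(urls, download_one, workers=3):
    """Run download_one(url) across a bounded worker pool, collecting results thread-safely."""
    workers = max(1, min(10, workers))  # clamp to the documented 1-10 range
    results, failures = [], []
    lock = threading.Lock()  # guards the shared result lists
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(download_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                outcome = future.result()
                with lock:
                    results.append((url, outcome))
            except Exception as exc:  # failed download; a candidate for retry
                with lock:
                    failures.append((url, exc))
    return results, failures
```

The `failures` list is where a retry pass with reduced concurrency could start, matching the "automatic retry mechanism" bullet.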
@@ -639,8 +434,11 @@ The codebase has been comprehensively refactored to improve maintainability and
 - **Robust download plan execution:** Fixed index management in download plan execution to prevent errors during interrupted downloads.
 
 ## 🐞 Troubleshooting
-- **Windows**: Ensure `yt-dlp.exe` is in the `downloader/` folder
-- **macOS**: Run `python3 setup_macos.py` to set up yt-dlp and FFmpeg
+- **Platform-specific yt-dlp setup**:
+  - **Windows**: Ensure `yt-dlp.exe` is in the `downloader/` folder
+  - **macOS**: Either ensure `yt-dlp_macos` is in the `downloader/` folder (make executable with `chmod +x`) OR install via pip (`pip install yt-dlp`)
+  - **Linux**: Ensure `yt-dlp` is in the `downloader/` folder (make executable with `chmod +x`)
+  - Run `python setup_platform.py` to automatically set up yt-dlp for your platform
 - Check `logs/` for error details
 - Use `python -m karaoke_downloader.check_resolution` to verify video quality
 - If you see errors about ffmpeg/ffprobe, install [ffmpeg](https://ffmpeg.org/download.html) and ensure it is in your PATH

162 commands.txt
@@ -1,6 +1,6 @@
 # 🎤 Karaoke Video Downloader - CLI Commands Reference
 # Copy and paste these commands into your terminal
-# Updated: v3.4.4 (includes macOS support, all videos download mode, manual video collection, channel parsing rules, and all previous improvements)
+# Updated: v3.5 (includes cross-platform support, optimized caching, and all refactoring improvements)
 
 ## 📥 BASIC DOWNLOADS
 
@@ -8,7 +8,7 @@
 python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
 
 # Download from a file containing multiple channel URLs
-python download_karaoke.py --file data/channels.json
+python download_karaoke.py --file data/channels.txt
 
 # Download with custom resolution (480p, 720p, 1080p, 1440p, 2160p)
 python download_karaoke.py --resolution 1080p https://www.youtube.com/@SingKingKaraoke/videos
@@ -19,69 +19,9 @@ python download_karaoke.py --limit 10 https://www.youtube.com/@SingKingKaraoke/v
 # Enable parallel downloads for faster processing (3-5x speedup)
 python download_karaoke.py --parallel --workers 5 --limit 10 https://www.youtube.com/@SingKingKaraoke/videos
 
-## 🎤 MANUAL VIDEO COLLECTION (v3.4.3)
-
-# Download from manual videos collection (data/manual_videos.json)
-python download_karaoke.py --manual --limit 5
-
-# Download manual videos with fuzzy matching
-python download_karaoke.py --manual --fuzzy-match --fuzzy-threshold 85 --limit 10
-
-# Download manual videos with parallel processing
-python download_karaoke.py --parallel --workers 3 --manual --limit 5
-
-# Download manual videos with songlist matching
-python download_karaoke.py --manual --songlist-only --limit 10
-
-# Force download from manual videos (bypass existing file checks)
-python download_karaoke.py --manual --force --limit 5
-
-# Add a video to manual collection (interactive)
-python utilities/add_manual_video.py add "Artist - Song Title (Karaoke Version)" "https://www.youtube.com/watch?v=VIDEO_ID"
-
-# List all manual videos
-python utilities/add_manual_video.py list
-
-# Remove a video from manual collection
-python utilities/add_manual_video.py remove "Artist - Song Title (Karaoke Version)"
-
-## 🎬 ALL VIDEOS DOWNLOAD MODE (v3.4.4)
-
-# Download ALL videos from a specific channel (not just songlist matches)
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
-
-# Download ALL videos with parallel processing for speed
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
-
-# Download ALL videos with limit (download first N videos)
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100
-
-# Download ALL videos with parallel processing and limit
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 5 --limit 50
-
-# Download ALL videos from ZoomKaraokeOfficial channel
-python download_karaoke.py --channel-focus ZoomKaraokeOfficial --all-videos
-
-# Download ALL videos with custom resolution
-python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --resolution 1080p
-
-## 📋 SONG LIST GENERATION
-
-# Generate song list from MP4 files in a directory (append to existing song list)
-python download_karaoke.py --generate-songlist /path/to/mp4/directory
-
-# Generate song list from multiple directories
-python download_karaoke.py --generate-songlist /path/to/dir1 /path/to/dir2 /path/to/dir3
-
-# Generate song list and create a new song list file (don't append)
-python download_karaoke.py --generate-songlist /path/to/mp4/directory --no-append-songlist
-
-# Generate song list from multiple directories and create new file
-python download_karaoke.py --generate-songlist /path/to/dir1 /path/to/dir2 --no-append-songlist
-
 ## 🎵 SONGLIST OPERATIONS
 
-# Download only songs from your songlist (uses data/channels.json by default)
+# Download only songs from your songlist (uses data/channels.txt by default)
 python download_karaoke.py --songlist-only
 
 # Download only songlist songs with limit
@@ -111,18 +51,6 @@ python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --limit 5
 # Focus on specific playlists with parallel processing
 python download_karaoke.py --parallel --workers 3 --songlist-focus "2025 - Apple Top 50" --limit 5
 
-# Focus on specific playlists from a custom songlist file
-python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json"
-
-# Focus on specific playlists from a custom file with force mode
-python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json" --force
-
-# Force download from channels regardless of existing files or server duplicates
-python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force
-
-# Force download with parallel processing
-python download_karaoke.py --parallel --workers 5 --songlist-focus "2025 - Apple Top 50" --force --limit 10
-
 # Prioritize songlist songs in download queue (default behavior)
 python download_karaoke.py --songlist-priority https://www.youtube.com/@SingKingKaraoke/videos
 
@@ -132,35 +60,6 @@ python download_karaoke.py --no-songlist-priority https://www.youtube.com/@SingK
 # Show songlist download status and statistics
 python download_karaoke.py --songlist-status
 
-## 📊 UNMATCHED SONGS REPORTS
-
-# Generate report of songs that couldn't be found in any channel (standalone)
-python download_karaoke.py --generate-unmatched-report
-
-# Generate report with fuzzy matching enabled (standalone)
-python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-threshold 85
-
-# Generate report using a specific channel file (standalone)
-python download_karaoke.py --generate-unmatched-report --file data/my_channels.txt
-
-# Generate report from a custom songlist file (standalone)
-python download_karaoke.py --generate-unmatched-report --songlist-file "data/my_custom_songlist.json"
-
-# Generate report with focus on specific playlists from a custom file (standalone)
-python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json" --generate-unmatched-report
-
-# Download songs AND generate unmatched report (additive feature)
-python download_karaoke.py --songlist-only --limit 10 --generate-unmatched-report
-
-# Download with fuzzy matching AND generate unmatched report
-python download_karaoke.py --songlist-only --fuzzy-match --fuzzy-threshold 85 --limit 10 --generate-unmatched-report
-
-# Download from specific playlists AND generate unmatched report
-python download_karaoke.py --songlist-focus "CCKaraoke" --limit 10 --generate-unmatched-report
-
-# Generate report with custom fuzzy threshold
-python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-threshold 80
-
 ## ⚡ PARALLEL DOWNLOADS (v3.4)
 
 # Basic parallel downloads (3-5x faster than sequential)
@@ -195,7 +94,7 @@ python download_karaoke.py --parallel --workers 3 --latest-per-channel --limit 5
 python download_karaoke.py --parallel --workers 3 --latest-per-channel --limit 5 --fuzzy-match --fuzzy-threshold 85
 
 # Download latest videos from specific channels file
-python download_karaoke.py --latest-per-channel --limit 5 --file data/channels.json
+python download_karaoke.py --latest-per-channel --limit 5 --file data/channels.txt
 
 ## 🔄 CACHE & TRACKING MANAGEMENT
 
@@ -254,7 +153,7 @@ python download_karaoke.py --version
 python download_karaoke.py --songlist-only --limit 20 --fuzzy-match --fuzzy-threshold 85 --resolution 1080p
 
 # Latest videos per channel with fuzzy matching
-python download_karaoke.py --latest-per-channel --limit 3 --fuzzy-match --fuzzy-threshold 90 --file data/channels.json
+python download_karaoke.py --latest-per-channel --limit 3 --fuzzy-match --fuzzy-threshold 90 --file data/channels.txt
 
 # Force refresh everything and download songlist
 python download_karaoke.py --songlist-only --force-download-plan --refresh --limit 10
@@ -273,9 +172,6 @@ python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10
 # 1b. Focus on specific playlists (fast targeted download)
 python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --limit 5
 
-# 1c. Force download from specific playlists (bypass all existing file checks)
-python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force --limit 5
-
 # 2. Latest videos from all channels
 python download_karaoke.py --latest-per-channel --limit 5
 
@@ -294,9 +190,6 @@ python download_karaoke.py --parallel --workers 5 --songlist-only --fuzzy-match
 # 4b. Focused fuzzy matching (target specific playlists with flexible matching)
 python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --fuzzy-match --fuzzy-threshold 80 --limit 10
 
-# 4c. Force download with fuzzy matching (bypass all existing file checks)
-python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force --fuzzy-match --fuzzy-threshold 80 --limit 10
-
 # 5. Reset and start fresh
 python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
 
@@ -304,38 +197,27 @@ python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
 python download_karaoke.py --status
 python download_karaoke.py --clear-cache all
 
-# 7. Download from manual video collection
-python download_karaoke.py --manual --limit 5
-
-# 7b. Fast parallel manual video download
-python download_karaoke.py --parallel --workers 3 --manual --limit 5
-
-# 7c. Manual videos with fuzzy matching
-python download_karaoke.py --manual --fuzzy-match --fuzzy-threshold 85 --limit 10
-
-## 🍎 macOS SETUP COMMANDS
-
-# Automatic macOS setup (detects OS and installs yt-dlp + FFmpeg)
-python3 setup_macos.py
-
-# Test macOS setup and functionality
-python3 src/tests/test_macos.py
-
-# Manual macOS setup options
-# Install yt-dlp via pip
-pip3 install yt-dlp
-
-# Download yt-dlp binary for macOS
-mkdir -p downloader && curl -L -o downloader/yt-dlp_macos https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos && chmod +x downloader/yt-dlp_macos
-
-# Install FFmpeg via Homebrew
-brew install ffmpeg
+## 🌐 PLATFORM SETUP COMMANDS (v3.5)
+
+# Automatic platform setup (detects OS and installs yt-dlp + FFmpeg)
+python setup_platform.py
+
+# Test platform detection and yt-dlp integration
+python test_platform.py
+
+# Manual platform-specific setup
+# Windows: Download yt-dlp.exe to downloader/ folder
+# macOS: brew install ffmpeg && pip install yt-dlp
+# Linux: sudo apt install ffmpeg && download yt-dlp to downloader/ folder
 
 ## 🔧 TROUBLESHOOTING COMMANDS
 
 # Check if everything is working
 python download_karaoke.py --version
 
+# Test platform setup
+python test_platform.py
+
 # Force refresh everything
 python download_karaoke.py --force-download-plan --refresh --clear-cache all
 
@@ -346,9 +228,7 @@ python download_karaoke.py --clear-server-duplicates
 ## 📝 NOTES
 
 # Default files used:
-# - data/channels.json (channel configuration with parsing rules, preferred)
-# - data/channels.json (channel configuration with parsing rules)
-# - data/manual_videos.json (manual video collection)
+# - data/channels.txt (default channel list for songlist modes)
 # - data/songList.json (your prioritized song list)
 # - data/config.json (download settings)
 
@@ -357,12 +237,11 @@ python download_karaoke.py --clear-server-duplicates
 # Fuzzy threshold: 0-100 (higher = more strict matching, default 90)
 
 # The system automatically:
-# - Uses data/channels.json for channel configuration and parsing rules
+# - Uses data/channels.txt if no --file specified in songlist modes
 # - Caches channel data for 24 hours (configurable)
 # - Tracks all downloads in JSON files
 # - Avoids re-downloading existing files
 # - Checks for server duplicates
-# - Supports manual video collection via --manual parameter
 
 # For best performance:
 # - Use --parallel --workers 5 for 3-5x faster downloads
@@ -370,7 +249,8 @@ python download_karaoke.py --clear-server-duplicates
 # - Use --fuzzy-match for better song discovery
 # - Use --refresh sparingly (forces re-scan)
 # - Clear cache if you encounter issues
-# - macOS users: Run `python3 setup_macos.py` for automatic setup
+# - Channel caching provides instant video list loading (no YouTube API calls)
+# - FFmpeg integration ensures optimal video processing and merging
 
 # Parallel download tips:
 # - Start with --workers 3 for conservative approach
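The notes above describe a 0-100 fuzzy threshold for song matching (default 90, `--fuzzy-threshold 85` for looser matches). As a minimal sketch of how such a score can gate candidate titles, using only the standard library's `difflib` (the project itself may use a different fuzzy-matching library; these helper names are hypothetical, not the tool's actual code):

```python
from difflib import SequenceMatcher

def fuzzy_score(a: str, b: str) -> int:
    """Similarity of two song titles on a 0-100 scale (case-insensitive)."""
    return round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)

def is_match(wanted: str, candidate: str, threshold: int = 90) -> bool:
    """Accept a candidate title only if its score meets the threshold."""
    return fuzzy_score(wanted, candidate) >= threshold
```

Lowering the threshold (e.g. to 85) accepts candidates with more spelling or punctuation drift, which is why the notes suggest using `--refresh` and strict thresholds sparingly.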
data/bak_songList.json: new file, 4022 lines (diff suppressed because it is too large)
data/channel_cache.json: new file, 164578 lines (diff suppressed because it is too large)
(two further large file diffs suppressed)
Deleted file (cached video list for @LetsSingKaraoke):

@@ -1,19 +0,0 @@
-{
-    "channel_id": "@LetsSingKaraoke",
-    "videos": [
-        {
-            "title": "Sub Urban - Cradles | Karaoke (instrumental)",
-            "id": "8uj7IzhdiO4"
-        },
-        {
-            "title": "Sia - Snowman | Karaoke (instrumental)",
-            "id": "ZbWHuncTgsM"
-        },
-        {
-            "title": "Trevor Daniel - Falling | Karaoke (Instrumental)",
-            "id": "nU7n2aq7f98"
-        }
-    ],
-    "last_updated": "2025-08-05T15:59:09.280488",
-    "video_count": 3
-}
Deleted file (raw yt-dlp capture for @LetsSingKaraoke):

@@ -1,10 +0,0 @@
-# Raw yt-dlp output for @LetsSingKaraoke
-# Channel URL: https://www.youtube.com/@LetsSingKaraoke/videos
-# Command: downloader/yt-dlp_macos --flat-playlist --print %(title)s|%(id)s|%(url)s --verbose https://www.youtube.com/@LetsSingKaraoke/videos
-# Timestamp: 2025-08-05T15:59:09.280155
-# Total lines: 3
-################################################################################
-
-1: Sub Urban - Cradles | Karaoke (instrumental)|8uj7IzhdiO4|https://www.youtube.com/watch?v=8uj7IzhdiO4
-2: Sia - Snowman | Karaoke (instrumental)|ZbWHuncTgsM|https://www.youtube.com/watch?v=ZbWHuncTgsM
-3: Trevor Daniel - Falling | Karaoke (Instrumental)|nU7n2aq7f98|https://www.youtube.com/watch?v=nU7n2aq7f98
(ten further large file diffs suppressed)
Deleted file: data/channels.json (191 lines)

@@ -1,191 +0,0 @@
-{
-  "channels": [
-    {
-      "name": "@SingKingKaraoke",
-      "url": "https://www.youtube.com/@SingKingKaraoke/videos",
-      "parsing_rules": {
-        "format": "artist_title_separator",
-        "separator": " - ",
-        "artist_first": true,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": ["(Karaoke)", "(Karaoke Version)", "Karaoke Version"]
-          }
-        },
-        "examples": [
-          "Artist - Title (Karaoke)",
-          "Artist - Title (Karaoke Version)"
-        ]
-      },
-      "description": "Standard artist - title format with karaoke suffix"
-    },
-    {
-      "name": "@KaraokeOnVEVO",
-      "url": "https://www.youtube.com/@KaraokeOnVEVO/videos",
-      "parsing_rules": {
-        "format": "artist_title_separator",
-        "separator": " - ",
-        "artist_first": true,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": ["(Karaoke)"]
-          }
-        },
-        "examples": [
-          "George Jones - A Picture Of Me (Without You) (Karaoke)",
-          "Iggy Pop, Kate Pierson - Candy (Karaoke)"
-        ]
-      },
-      "description": "Standard artist - title format with (Karaoke) suffix"
-    },
-    {
-      "name": "@StingrayKaraoke",
-      "url": "https://www.youtube.com/@StingrayKaraoke/videos",
-      "parsing_rules": {
-        "format": "artist_title_separator",
-        "separator": " - ",
-        "artist_first": true,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": ["(Karaoke Version)"]
-          }
-        },
-        "playlist_indicators": [
-          "TOP SONGS OF",
-          "THE BEST",
-          "BEST",
-          "NON-STOP",
-          "MASHUP",
-          "FEAT.",
-          "WITH LYRICS"
-        ],
-        "examples": [
-          "Gracie Abrams - That's So True (Karaoke Version)",
-          "TOP SONGS OF 2024 KARAOKE WITH LYRICS BY BILLIE EILISH, GRACIE ABRAMS & MORE"
-        ]
-      },
-      "description": "Standard artist - title format with (Karaoke Version) suffix, also has playlist titles"
-    },
-    {
-      "name": "@sing2karaoke",
-      "url": "https://www.youtube.com/@sing2karaoke/videos",
-      "parsing_rules": {
-        "format": "artist_title_spaces",
-        "separator": " ",
-        "artist_first": true,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": ["(Karaoke Version) Lyrics", "(Karaoke Version)", "Karaoke Version Lyrics"]
-          }
-        },
-        "multi_artist_separator": ", ",
-        "examples": [
-          "Lauren Spencer Smith Fingers Crossed",
-          "Calvin Harris, Clementine Douglas Blessings (Karaoke Version) Lyrics"
-        ]
-      },
-      "description": "Artist and title separated by multiple spaces, supports multiple artists"
-    },
-    {
-      "name": "@ZoomKaraokeOfficial",
-      "url": "https://www.youtube.com/@ZoomKaraokeOfficial/videos",
-      "parsing_rules": {
-        "format": "artist_title_separator",
-        "separator": " - ",
-        "artist_first": true,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": [
-              "(Karaoke)",
-              "(Karaoke Version)",
-              "Karaoke Version",
-              "- Karaoke Version from Zoom Karaoke",
-              "- Karaoke Version from Zoom",
-              "- Karaoke Version from Zoom Karaoke (Radiohead Cover)",
-              "- Karaoke Version from Zoom (Radiohead Cover)"
-            ]
-          }
-        },
-        "examples": [
-          "The Mavericks - Here Comes My Baby - Karaoke Version from Zoom Karaoke"
-        ]
-      },
-      "description": "Standard artist - title format with '- Karaoke Version from Zoom Karaoke' suffix"
-    },
-    {
-      "name": "@VocalStarKaraoke",
-      "url": "https://www.youtube.com/@VocalStarKaraoke/videos",
-      "parsing_rules": {
-        "format": "artist_title_separator",
-        "separator": " - ",
-        "artist_first": false,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": ["KARAOKE Without Backing Vocals", "KARAOKE With Vocal Guide", "KARAOKE"]
-          }
-        },
-        "examples": [
-          "Don't Say You Love Me - Jin KARAOKE Without Backing Vocals",
-          "Don't Say You Love Me - Jin KARAOKE With Vocal Guide"
-        ]
-      },
-      "description": "Title first, then dash separator, then artist with KARAOKE suffix"
-    },
-    {
-      "name": "@ManualVideos",
-      "url": "manual://static",
-      "manual_videos_file": "data/manual_videos.json",
-      "parsing_rules": {
-        "format": "artist_title_separator",
-        "separator": " - ",
-        "artist_first": true,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": ["(Karaoke)", "(Karaoke Version)", "(Karaoke Version) Lyrics"]
-          }
-        }
-      },
-      "description": "Manual collection of individual karaoke videos (static, never expires)"
-    },
-    {
-      "name": "Let's Sing Karaoke",
-      "url": "https://www.youtube.com/@LetsSingKaraoke/videos",
-      "parsing_rules": {
-        "format": "artist_title_separator",
-        "separator": " - ",
-        "artist_first": true,
-        "title_cleanup": {
-          "remove_suffix": {
-            "suffixes": ["(Karaoke)", "(Karaoke Version)", "Karaoke Version", "(In the style of)"]
-          }
-        },
-        "examples": [
-          "Artist - Title (Karaoke)",
-          "Artist - Title (In the style of Other Artist)"
-        ]
-      },
-      "artist_name_processing": true,
-      "description": "Let's Sing Karaoke with enhanced artist name processing"
-    }
-  ],
-  "global_parsing_settings": {
-    "fallback_format": "artist_title_separator",
-    "fallback_separator": " - ",
-    "common_suffixes": [
-      "(Karaoke)",
-      "(Karaoke Version)",
-      "Karaoke Version",
-      "(Karaoke Version) Lyrics",
-      "Karaoke Version Lyrics"
-    ],
-    "playlist_indicators": [
-      "TOP",
-      "BEST",
-      "MASHUP",
-      "FEAT.",
-      "WITH LYRICS",
-      "NON-STOP",
-      "PLAYLIST"
-    ]
-  }
-}
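The deleted configuration above drives title parsing: each channel supplies a separator, an artist/title order, and suffixes to strip. As a rough sketch of how one such rule maps a video title to an (artist, title) pair (assuming only the `separator`, `artist_first`, and `remove_suffix` fields shown above; `apply_rule` is a hypothetical helper, not the project's code):

```python
def apply_rule(video_title: str, separator: str = " - ",
               artist_first: bool = True,
               suffixes: tuple = ("(Karaoke)", "(Karaoke Version)", "Karaoke Version")) -> tuple:
    """Split on the first separator, then strip any configured suffix from the title."""
    if separator not in video_title:
        return "", video_title.strip()
    left, right = (p.strip() for p in video_title.split(separator, 1))
    artist, title = (left, right) if artist_first else (right, left)
    for suffix in suffixes:
        if title.endswith(suffix):
            title = title[:-len(suffix)].strip()
            break
    return artist, title
```

Setting `artist_first=False` models channels like @VocalStarKaraoke, where the title precedes the dash.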
New file: data/channels.txt (7 lines)

@@ -0,0 +1,7 @@
+https://www.youtube.com/@SingKingKaraoke/videos
+https://www.youtube.com/@karafun/videos
+https://www.youtube.com/@KaraokeOnVEVO/videos
+https://www.youtube.com/@StingrayKaraoke/videos
+https://www.youtube.com/@CCKaraoke/videos
+https://www.youtube.com/@AtomicKaraoke/videos
+https://www.youtube.com/@sing2karaoke/videos
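A plain-text channel list like the one added above needs only a trivial reader. This loader is a sketch under the assumption that the file is one URL per line (the project's actual reader may differ); it skips blank lines and `#` comments:

```python
def load_channel_urls(text: str) -> list:
    """Parse a channels.txt-style listing into a list of channel URLs."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls
```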
@@ -2,11 +2,7 @@ import json
 from pathlib import Path
 from datetime import datetime, time
 
-from karaoke_downloader.data_path_manager import get_data_path_manager
-
-def cleanup_recent_tracking(tracking_path=None, cutoff_time_str="11:00"):
-    if tracking_path is None:
-        tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
+def cleanup_recent_tracking(tracking_path="data/songlist_tracking.json", cutoff_time_str="11:00"):
     """Remove entries from songlist_tracking.json that were added after the specified time today."""
     tracking_file = Path(tracking_path)
     if not tracking_file.exists():
@@ -19,14 +19,13 @@
         "writethumbnail": false,
         "embed_metadata": false,
         "continuedl": true,
-        "nooverwrites": false,
+        "nooverwrites": true,
         "ignoreerrors": true,
         "no_warnings": false
     },
     "folder_structure": {
         "downloads_dir": "downloads",
         "logs_dir": "logs",
-        "data_dir": "data",
         "tracking_file": "downloaded_videos.json"
     },
     "logging": {
@@ -39,7 +38,8 @@
     "auto_detect_platform": true,
     "yt_dlp_paths": {
         "windows": "downloader/yt-dlp.exe",
-        "macos": "downloader/yt-dlp_macos"
+        "macos": "python3 -m yt_dlp",
+        "linux": "downloader/yt-dlp"
     },
 },
 "yt_dlp_path": "downloader/yt-dlp.exe"
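The `yt_dlp_paths` change above keys the yt-dlp command by operating system. A minimal sketch of the resolution logic, assuming the mapping shown in the diff (the project's real `auto_detect_platform` code may differ):

```python
import sys

# Mapping taken from the data/config.json diff above.
YT_DLP_PATHS = {
    "windows": "downloader/yt-dlp.exe",
    "macos": "python3 -m yt_dlp",
    "linux": "downloader/yt-dlp",
}

def detect_platform(platform: str = sys.platform) -> str:
    """Map a sys.platform value onto the config's platform keys."""
    if platform.startswith("win"):
        return "windows"
    if platform == "darwin":
        return "macos"
    return "linux"

def yt_dlp_command(platform: str = sys.platform) -> str:
    """Pick the configured yt-dlp invocation for the current (or given) platform."""
    return YT_DLP_PATHS[detect_platform(platform)]
```

Note the macOS entry is now a `python3 -m yt_dlp` module invocation rather than a bundled binary, so it must be split into argv parts before being passed to `subprocess`.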
data/karaoke_tracking.json: 115120 lines (diff suppressed because it is too large)
Deleted file: data/manual_videos.json (85 lines)

@@ -1,85 +0,0 @@
-{
-  "channel_name": "@ManualVideos",
-  "channel_url": "manual://static",
-  "description": "Manual collection of individual karaoke videos",
-  "videos": [
-    {
-      "title": "Nickelback - Photograph",
-      "url": "https://www.youtube.com/watch?v=qZXwpceqt9s",
-      "id": "qZXwpceqt9s",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    },
-    {
-      "title": "Ed Sheeran & Beyoncé - Perfect Duet",
-      "url": "https://www.youtube.com/watch?v=qegLWI99Wg0",
-      "id": "qegLWI99Wg0",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    },
-    {
-      "title": "10,000 Maniacs - More Than This",
-      "url": "https://www.youtube.com/watch?v=wxnuF-APJ5M",
-      "id": "wxnuF-APJ5M",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    },
-    {
-      "title": "AC/DC - Big Balls",
-      "url": "https://www.youtube.com/watch?v=kiSDpVmu4Bk",
-      "id": "kiSDpVmu4Bk",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    },
-    {
-      "title": "Jon Bon Jovi - Blaze of Glory",
-      "url": "https://www.youtube.com/watch?v=SzRAoDMlQY",
-      "id": "SzRAoDMlQY",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    },
-    {
-      "title": "ZZ Top - Sharp Dressed Man",
-      "url": "https://www.youtube.com/watch?v=prRalwto9iY",
-      "id": "prRalwto9iY",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    },
-    {
-      "title": "Nickelback - Photograph",
-      "url": "https://www.youtube.com/watch?v=qTphCTAUhUg",
-      "id": "qTphCTAUhUg",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    },
-    {
-      "title": "Billy Joel - Shes Got A Way",
-      "url": "https://www.youtube.com/watch?v=DeeTFIgKuC8",
-      "id": "DeeTFIgKuC8",
-      "upload_date": "2024-01-01",
-      "duration": 180,
-      "view_count": 1000
-    }
-  ],
-  "parsing_rules": {
-    "format": "artist_title_separator",
-    "separator": " - ",
-    "artist_first": true,
-    "title_cleanup": {
-      "remove_suffix": {
-        "suffixes": [
-          "(Karaoke)",
-          "(Karaoke Version)",
-          "(Karaoke Version) Lyrics"
-        ]
-      }
-    }
-  }
-}
(large file diff suppressed)

@@ -23902,7 +23902,7 @@
         "title": "Superman (It's Not Easy)"
     },
     {
-        "artist": "'NSync",
+        "artist": "'N Sync",
         "position": 16,
         "title": "Gone"
     },
@@ -24122,7 +24122,7 @@
         "title": "Turn Off The Light"
     },
     {
-        "artist": "'NSync",
+        "artist": "'N Sync",
         "position": 13,
         "title": "Gone"
     },
@@ -24617,7 +24617,7 @@
         "title": "Most Girls"
     },
     {
-        "artist": "'NSync",
+        "artist": "'N Sync",
         "position": 11,
         "title": "This I Promise You"
     },
@@ -24857,7 +24857,7 @@
         "title": "I Just Wanna Love U (Give It 2 Me)"
     },
     {
-        "artist": "'NSync",
+        "artist": "'N Sync",
         "position": 12,
         "title": "This I Promise You"
     },
@@ -25857,7 +25857,7 @@
         "title": "Tha Block Is Hot"
     },
     {
-        "artist": "'NSync & Gloria Estefan",
+        "artist": "'N Sync & Gloria Estefan",
         "position": 85,
         "title": "Music Of My Heart"
     },
@@ -26237,7 +26237,7 @@
         "title": "Touch It"
     },
     {
-        "artist": "NSync",
+        "artist": "N Sync",
         "position": 34,
         "title": "(God Must Have Spent) A Little More Time On You"
     },
@@ -1,15 +1,11 @@
 import json
 from pathlib import Path
 
-from karaoke_downloader.data_path_manager import get_data_path_manager
-
 def normalize_title(title):
     normalized = title.replace("(Karaoke Version)", "").replace("(Karaoke)", "").strip()
     return " ".join(normalized.split()).lower()
 
-def load_songlist(songlist_path=None):
-    if songlist_path is None:
-        songlist_path = str(get_data_path_manager().get_songlist_path())
+def load_songlist(songlist_path="data/songList.json"):
     songlist_file = Path(songlist_path)
     if not songlist_file.exists():
         print(f"⚠️ Songlist file not found: {songlist_path}")
@@ -28,18 +24,14 @@ def load_songlist(songlist_path=None):
         })
     return all_songs
 
-def load_songlist_tracking(tracking_path=None):
-    if tracking_path is None:
-        tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
+def load_songlist_tracking(tracking_path="data/songlist_tracking.json"):
     tracking_file = Path(tracking_path)
     if not tracking_file.exists():
         return {}
     with open(tracking_file, 'r', encoding='utf-8') as f:
         return json.load(f)
 
-def load_server_songs(songs_path=None):
-    if songs_path is None:
-        songs_path = str(get_data_path_manager().get_songs_path())
+def load_server_songs(songs_path="data/songs.json"):
     """Load the list of songs already available on the server."""
     songs_file = Path(songs_path)
     if not songs_file.exists():
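The `normalize_title` helper in the hunk above is unchanged by the diff; it is what makes tracking comparisons ignore case, spacing, and karaoke suffixes. Restated for reference (same logic as shown in the diff):

```python
def normalize_title(title):
    # Strip the karaoke suffixes, collapse runs of whitespace, lower-case.
    normalized = title.replace("(Karaoke Version)", "").replace("(Karaoke)", "").strip()
    return " ".join(normalized.split()).lower()
```

For example, `"Hello  (Karaoke Version)"` and `"hello"` normalize to the same key, so the tracker treats them as the same song.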
(large file diff suppressed)

New binary file: downloader/yt-dlp (binary file not shown)
@@ -9,8 +9,6 @@ import json
 from datetime import datetime, timedelta
 from pathlib import Path
 
-from karaoke_downloader.data_path_manager import get_data_path_manager
-
 # Constants
 DEFAULT_CACHE_EXPIRATION_DAYS = 1
 DEFAULT_CACHE_FILENAME_LENGTH_LIMIT = 200  # Increased from 60
@@ -39,7 +37,7 @@ def get_download_plan_cache_file(mode, **kwargs):
         + hashlib.md5(base.encode()).hexdigest()[:8]
     )
 
-    return get_data_path_manager().get_path(f"{base}.json")
+    return Path(f"data/{base}.json")
 
 
 def load_cached_plan(cache_file, max_age_days=DEFAULT_CACHE_EXPIRATION_DAYS):
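The hunk above keeps the md5-suffixed cache-file naming and changes only where the file lives (a hardcoded `data/` directory instead of the path manager). A sketch of that naming scheme; the exact base-string construction and truncation length are assumptions, but the shape (truncated base plus an 8-character md5 suffix) matches the diff:

```python
import hashlib
from pathlib import Path

def plan_cache_file(mode: str, limit: int = 200, **kwargs) -> Path:
    """Build a stable, filesystem-safe cache filename: truncated base plus md5 suffix."""
    base = "_".join([mode] + [f"{k}-{v}" for k, v in sorted(kwargs.items())])
    # Truncation keeps the name readable; the hash keeps truncated names unique.
    base = base[:limit] + "_" + hashlib.md5(base.encode()).hexdigest()[:8]
    return Path(f"data/{base}.json")
```

Because the hash is derived from the full argument string, identical download-plan arguments always resolve to the same path, which is what lets `load_cached_plan` find a previous run's plan.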
@ -1,260 +0,0 @@
|
|||||||
"""
|
|
||||||
Channel-specific parsing utilities for extracting artist and title from video titles.
|
|
||||||
|
|
||||||
This module handles the different title formats used by various karaoke channels,
|
|
||||||
providing channel-specific parsing rules to extract artist and title information
|
|
||||||
correctly for ID3 tagging and filename generation.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import json
|
|
||||||
import re
|
|
||||||
from typing import Dict, List, Optional, Tuple, Any
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from karaoke_downloader.data_path_manager import get_data_path_manager
|
|
||||||
|
|
||||||
|
|
||||||
class ChannelParser:
|
|
||||||
"""Handles channel-specific parsing of video titles to extract artist and title."""
|
|
||||||
|
|
||||||
def __init__(self, channels_file: str = None):
|
|
||||||
if channels_file is None:
|
|
||||||
channels_file = str(get_data_path_manager().get_channels_json_path())
|
|
||||||
"""Initialize the parser with channel configuration."""
|
|
||||||
self.channels_file = Path(channels_file)
|
|
||||||
self.channels_config = self._load_channels_config()
|
|
||||||
|
|
||||||
def _load_channels_config(self) -> Dict[str, Any]:
|
|
||||||
"""Load the channels configuration from JSON file."""
|
|
||||||
if not self.channels_file.exists():
|
|
||||||
raise FileNotFoundError(f"Channels configuration file not found: {self.channels_file}")
|
|
||||||
|
|
||||||
with open(self.channels_file, 'r', encoding='utf-8') as f:
|
|
||||||
return json.load(f)
|
|
||||||
|
|
||||||
def get_channel_config(self, channel_name: str) -> Optional[Dict[str, Any]]:
|
|
||||||
"""Get the configuration for a specific channel."""
|
|
||||||
for channel in self.channels_config.get("channels", []):
|
|
||||||
if channel["name"] == channel_name:
|
|
||||||
return channel
|
|
||||||
return None
|
|
||||||
|
|
||||||
def extract_artist_title(self, video_title: str, channel_name: str) -> Tuple[str, str]:
|
|
||||||
"""
|
|
||||||
Extract artist and title from a video title using channel-specific parsing rules.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
video_title: The full video title from YouTube
|
|
||||||
channel_name: The name of the channel (must match config)
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Tuple of (artist, title) - both may be empty strings if parsing fails
|
|
||||||
"""
|
|
||||||
channel_config = self.get_channel_config(channel_name)
|
|
||||||
if not channel_config:
|
|
||||||
# Fallback to global settings
|
|
||||||
return self._fallback_parse(video_title)
|
|
||||||
|
|
||||||
parsing_rules = channel_config.get("parsing_rules", {})
|
|
||||||
format_type = parsing_rules.get("format", "artist_title_separator")
|
|
||||||
|
|
||||||
if format_type == "artist_title_separator":
|
|
||||||
return self._parse_artist_title_separator(video_title, parsing_rules)
|
|
||||||
elif format_type == "artist_title_spaces":
|
|
||||||
return self._parse_artist_title_spaces(video_title, parsing_rules)
|
|
||||||
elif format_type == "title_artist_pipe":
|
|
||||||
return self._parse_title_artist_pipe(video_title, parsing_rules)
|
|
||||||
else:
|
|
||||||
return self._fallback_parse(video_title)
|
|
||||||
|
|
||||||
def _parse_artist_title_separator(self, video_title: str, rules: Dict[str, Any]) -> Tuple[str, str]:
|
|
||||||
"""Parse format: 'Artist - Title' or 'Title - Artist'."""
|
|
||||||
separator = rules.get("separator", " - ")
|
|
||||||
artist_first = rules.get("artist_first", True)
|
|
||||||
|
|
||||||
if separator not in video_title:
|
|
||||||
return "", video_title.strip()
|
|
||||||
|
|
||||||
parts = video_title.split(separator, 1)
|
|
||||||
if len(parts) != 2:
|
|
||||||
return "", video_title.strip()
|
|
||||||
|
|
||||||
part1, part2 = parts[0].strip(), parts[1].strip()
|
|
||||||
|
|
||||||
# Apply cleanup to both parts
|
|
||||||
part1_clean = self._cleanup_title(part1, rules.get("title_cleanup", {}))
|
|
||||||
part2_clean = self._cleanup_title(part2, rules.get("title_cleanup", {}))
|
|
||||||
|
|
||||||
if artist_first:
|
|
||||||
return part1_clean, part2_clean
|
|
||||||
else:
|
|
||||||
return part2_clean, part1_clean
|
|
||||||
|
|
||||||
def _parse_artist_title_spaces(self, video_title: str, rules: Dict[str, Any]) -> Tuple[str, str]:
|
|
||||||
"""Parse format: 'Artist Title' (multiple spaces)."""
|
|
||||||
separator = rules.get("separator", " ")
|
|
||||||
multi_artist_sep = rules.get("multi_artist_separator", ", ")
|
|
||||||
|
|
||||||
# Try multiple space patterns to handle inconsistent spacing
|
|
||||||
# Look for the LAST occurrence of multiple spaces to handle cases with commas
|
|
||||||
space_patterns = [" ", " ", " "] # 3, 2, 4 spaces
|
|
||||||
|
|
||||||
for pattern in space_patterns:
|
|
||||||
if pattern in video_title:
|
|
||||||
# Split on the LAST occurrence of the pattern
|
|
||||||
last_index = video_title.rfind(pattern)
|
|
||||||
if last_index != -1:
|
|
||||||
artist_part = video_title[:last_index].strip()
|
|
||||||
title_part = video_title[last_index + len(pattern):].strip()
|
|
||||||
|
|
||||||
# Handle multiple artists (e.g., "Artist1, Artist2")
|
|
||||||
if multi_artist_sep in artist_part:
|
|
||||||
# Keep the full artist string as is
|
|
||||||
artist = artist_part
|
|
||||||
else:
|
|
||||||
artist = artist_part
|
|
||||||
|
|
||||||
title = self._cleanup_title(title_part, rules.get("title_cleanup", {}))
|
|
||||||
|
|
||||||
return artist, title
|
|
||||||
|
|
||||||
# Try dash patterns as fallback for inconsistent formatting
|
|
||||||
dash_patterns = [" - ", " – ", " -"] # Regular dash, en dash, dash without trailing space
|
|
||||||
|
|
||||||
for pattern in dash_patterns:
|
|
||||||
if pattern in video_title:
|
|
||||||
# Split on the LAST occurrence of the pattern
|
|
||||||
last_index = video_title.rfind(pattern)
|
|
||||||
if last_index != -1:
|
|
||||||
artist_part = video_title[:last_index].strip()
|
|
||||||
title_part = video_title[last_index + len(pattern):].strip()
|
|
||||||
|
|
||||||
# Handle multiple artists (e.g., "Artist1, Artist2")
|
|
||||||
if multi_artist_sep in artist_part:
|
|
||||||
# Keep the full artist string as is
|
|
||||||
artist = artist_part
|
|
||||||
else:
|
|
||||||
artist = artist_part
|
|
||||||
|
|
||||||
title = self._cleanup_title(title_part, rules.get("title_cleanup", {}))
|
|
||||||
|
|
||||||
return artist, title
|
|
||||||
|
|
||||||
# If no pattern matches, return empty artist and full title
|
|
||||||
return "", video_title.strip()
|
|
||||||
|
|
||||||
def _parse_title_artist_pipe(self, video_title: str, rules: Dict[str, Any]) -> Tuple[str, str]:
|
|
||||||
"""Parse format: 'Title | Artist'."""
|
|
||||||
separator = rules.get("separator", " | ")
|
|
||||||
|
|
||||||
if separator not in video_title:
|
|
||||||
return "", video_title.strip()
|
|
||||||
|
|
||||||
parts = video_title.split(separator, 1)
|
|
||||||
        if len(parts) != 2:
            return "", video_title.strip()

        title_part, artist_part = parts[0].strip(), parts[1].strip()

        title = self._cleanup_title(title_part, rules.get("title_cleanup", {}))
        artist = self._cleanup_title(artist_part, rules.get("artist_cleanup", {}))

        return artist, title

    def _cleanup_title(self, text: str, cleanup_rules: Dict[str, Any]) -> str:
        """Apply cleanup rules to remove suffixes and normalize text."""
        if not cleanup_rules:
            return text.strip()

        cleaned = text.strip()

        # Handle remove_suffix rule
        if "remove_suffix" in cleanup_rules:
            suffixes = cleanup_rules["remove_suffix"].get("suffixes", [])
            for suffix in suffixes:
                if cleaned.endswith(suffix):
                    cleaned = cleaned[:-len(suffix)].strip()
                    break

        return cleaned

    def _fallback_parse(self, video_title: str) -> Tuple[str, str]:
        """Fallback parsing using global settings."""
        global_settings = self.channels_config.get("global_parsing_settings", {})
        fallback_format = global_settings.get("fallback_format", "artist_title_separator")
        fallback_separator = global_settings.get("fallback_separator", " - ")

        if fallback_format == "artist_title_separator":
            if fallback_separator in video_title:
                parts = video_title.split(fallback_separator, 1)
                if len(parts) == 2:
                    artist = parts[0].strip()
                    title = parts[1].strip()
                    # Apply global suffix cleanup
                    for suffix in global_settings.get("common_suffixes", []):
                        if title.endswith(suffix):
                            title = title[:-len(suffix)].strip()
                            break
                    return artist, title

        # If all else fails, return empty artist and full title
        return "", video_title.strip()
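The fallback path above can be exercised outside the class. This is a minimal standalone sketch of the same split-then-strip-suffix logic; the function name and the suffix list passed in are illustrative, not the project's actual `global_parsing_settings` config:

```python
def fallback_parse(video_title, separator=" - ", common_suffixes=()):
    """Split 'Artist - Title' once on the separator, then drop a known suffix."""
    if separator in video_title:
        artist, title = video_title.split(separator, 1)
        artist, title = artist.strip(), title.strip()
        for suffix in common_suffixes:
            if title.endswith(suffix):
                title = title[:-len(suffix)].strip()
                break
        return artist, title
    # No separator: empty artist, full title (mirrors the method above)
    return "", video_title.strip()

print(fallback_parse("Queen - Bohemian Rhapsody (Karaoke Version)",
                     common_suffixes=["(Karaoke Version)"]))
# → ('Queen', 'Bohemian Rhapsody')
```

Note that `split(separator, 1)` splits only on the first occurrence, so artists whose names contain the separator still lose only the leading segment.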
    def is_playlist_title(self, video_title: str, channel_name: str) -> bool:
        """Check if a video title appears to be a playlist rather than a single song."""
        channel_config = self.get_channel_config(channel_name)
        if not channel_config:
            return self._is_playlist_by_global_rules(video_title)

        parsing_rules = channel_config.get("parsing_rules", {})
        playlist_indicators = parsing_rules.get("playlist_indicators", [])

        if not playlist_indicators:
            return self._is_playlist_by_global_rules(video_title)

        title_upper = video_title.upper()
        for indicator in playlist_indicators:
            if indicator.upper() in title_upper:
                return True

        return False

    def _is_playlist_by_global_rules(self, video_title: str) -> bool:
        """Check if title is a playlist using global rules."""
        global_settings = self.channels_config.get("global_parsing_settings", {})
        playlist_indicators = global_settings.get("playlist_indicators", [])

        title_upper = video_title.upper()
        for indicator in playlist_indicators:
            if indicator.upper() in title_upper:
                return True

        return False

    def get_all_channel_names(self) -> List[str]:
        """Get a list of all configured channel names."""
        return [channel["name"] for channel in self.channels_config.get("channels", [])]

    def get_channel_url(self, channel_name: str) -> Optional[str]:
        """Get the URL for a specific channel."""
        channel_config = self.get_channel_config(channel_name)
        return channel_config.get("url") if channel_config else None
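Both playlist checks above reduce to the same case-insensitive substring test. A standalone sketch (the indicator list shown is illustrative, not the shipped configuration):

```python
def is_playlist_title(video_title, indicators):
    """True if any configured indicator appears in the title, ignoring case."""
    title_upper = video_title.upper()
    return any(indicator.upper() in title_upper for indicator in indicators)

print(is_playlist_title("Top 40 Karaoke Mix", ["MIX", "MEDLEY"]))
# → True
```

Uppercasing both sides once keeps the comparison cheap while staying case-insensitive.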
# Convenience function for backward compatibility
def extract_artist_title(video_title: str, channel_name: str, channels_file: str = None) -> Tuple[str, str]:
    """
    Convenience function to extract artist and title from a video title.

    Args:
        video_title: The full video title from YouTube
        channel_name: The name of the channel
        channels_file: Path to the channels configuration file

    Returns:
        Tuple of (artist, title)
    """
    if channels_file is None:
        channels_file = str(get_data_path_manager().get_channels_json_path())
    parser = ChannelParser(channels_file)
    return parser.extract_artist_title(video_title, channel_name)
@@ -1,117 +1,27 @@
-#!/usr/bin/env python3
-"""
-Karaoke Video Downloader CLI
-Command-line interface for the karaoke video downloader.
-"""
-
 import argparse
 import os
 import sys
-from pathlib import Path
-from typing import List
-
-from karaoke_downloader.channel_parser import ChannelParser
+from pathlib import Path
-from karaoke_downloader.config_manager import AppConfig
-from karaoke_downloader.data_path_manager import get_data_path_manager
 from karaoke_downloader.downloader import KaraokeDownloader
 
 # Constants
-DEFAULT_LATEST_PER_CHANNEL_LIMIT = 10
 DEFAULT_FUZZY_THRESHOLD = 85
+DEFAULT_LATEST_PER_CHANNEL_LIMIT = 5
+DEFAULT_DISPLAY_LIMIT = 10
+DEFAULT_CACHE_DURATION_HOURS = 24
-
-def load_channels_from_json(channels_file: str = None) -> List[str]:
-    """
-    Load channel URLs from the new JSON format.
-
-    Args:
-        channels_file: Path to the channels.json file (if None, uses default from config)
-
-    Returns:
-        List of channel URLs
-    """
-    if channels_file is None:
-        channels_file = str(get_data_path_manager().get_channels_json_path())
-
-    try:
-        parser = ChannelParser(channels_file)
-        channels = parser.channels_config.get("channels", [])
-        return [channel["url"] for channel in channels]
-    except Exception as e:
-        print(f"❌ Error loading channels from {channels_file}: {e}")
-        return []
-
-
-def load_channels_from_text(channels_file: str = None) -> List[str]:
-    """
-    Load channel URLs from the old text format (for backward compatibility).
-
-    Args:
-        channels_file: Path to the channels.txt file (if None, uses default from config)
-
-    Returns:
-        List of channel URLs
-    """
-    if channels_file is None:
-        channels_file = str(get_data_path_manager().get_channels_txt_path())
-
-    try:
-        with open(channels_file, "r", encoding="utf-8") as f:
-            return [
-                line.strip()
-                for line in f
-                if line.strip() and not line.strip().startswith("#")
-            ]
-    except Exception as e:
-        print(f"❌ Error loading channels from {channels_file}: {e}")
-        return []
-
-
-def load_channels(channel_file: str = None) -> List[str]:
-    """Load channel URLs from file."""
-    if channel_file is None:
-        # Use JSON configuration
-        data_path_manager = get_data_path_manager()
-        if data_path_manager.file_exists("channels.json"):
-            return load_channels_from_json()
-        else:
-            return []
-    else:
-        if channel_file.endswith(".json"):
-            return load_channels_from_json(channel_file)
-        else:
-            return load_channels_from_text(channel_file)
-
-
-def get_channel_url_by_name(channel_name: str) -> str:
-    """Look up a channel URL by its name from the channels configuration."""
-    channel_urls = load_channels()
-
-    # Normalize the channel name for comparison
-    normalized_name = channel_name.lower().replace("@", "").replace("karaoke", "").strip()
-
-    for url in channel_urls:
-        # Extract channel name from URL
-        if "/@" in url:
-            url_channel_name = url.split("/@")[1].split("/")[0].lower()
-            if url_channel_name == normalized_name or url_channel_name.replace("karaoke", "").strip() == normalized_name:
-                return url
-
-    return None
-
-
 def main():
     parser = argparse.ArgumentParser(
-        description="Karaoke Video Downloader - Download YouTube playlists and channel videos for karaoke (default: downloads latest videos from all channels)",
+        description="Karaoke Video Downloader - Download YouTube playlists and channel videos for karaoke",
         formatter_class=argparse.RawDescriptionHelpFormatter,
         epilog="""
 Examples:
-  python download_karaoke.py --limit 10  # Download latest 10 videos from all channels
-  python download_karaoke.py --songlist-only --limit 10  # Download only songlist songs across channels
-  python download_karaoke.py --channel-focus SingKingKaraoke --limit 5  # Download from specific channel
-  python download_karaoke.py --channel-focus SingKingKaraoke --all-videos  # Download ALL videos from channel
-  python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos  # Download from specific channel URL
-  python download_karaoke.py --file data/channels.txt  # Download from custom channel list
+  python download_karaoke.py https://www.youtube.com/playlist?list=XYZ
+  python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
+  python download_karaoke.py --file data/channels.txt
   python download_karaoke.py --reset-channel SingKingKaraoke --delete-files
 """,
     )
@@ -182,34 +92,13 @@ Examples:
     parser.add_argument(
         "--songlist-priority",
         action="store_true",
-        help="Prioritize downloads based on songList.json in the data directory (default: enabled)",
+        help="Prioritize downloads based on data/songList.json (default: enabled)",
     )
     parser.add_argument(
         "--no-songlist-priority",
         action="store_true",
         help="Disable songlist prioritization",
     )
-    parser.add_argument(
-        "--generate-unmatched-report",
-        action="store_true",
-        help="Generate a report of songs that couldn't be found in any channel (runs after downloads)",
-    )
-    parser.add_argument(
-        "--show-pagination",
-        action="store_true",
-        help="Show page-by-page progress when downloading channel video lists (slower but more detailed)",
-    )
-    parser.add_argument(
-        "--parallel-channels",
-        action="store_true",
-        help="Enable parallel channel scanning for faster channel processing (scans multiple channels simultaneously)",
-    )
-    parser.add_argument(
-        "--channel-workers",
-        type=int,
-        default=3,
-        help="Number of parallel channel scanning workers (default: 3, max: 10)",
-    )
     parser.add_argument(
         "--songlist-only",
         action="store_true",
@@ -221,16 +110,6 @@ Examples:
         metavar="PLAYLIST_TITLE",
         help='Focus on specific playlists by title (e.g., --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100")',
     )
-    parser.add_argument(
-        "--songlist-file",
-        metavar="FILE_PATH",
-        help="Custom songlist file path to use with --songlist-focus (default: songList.json in the data directory)",
-    )
-    parser.add_argument(
-        "--force",
-        action="store_true",
-        help="Force download from channels regardless of whether songs are already downloaded, on server, or marked as duplicates",
-    )
     parser.add_argument(
         "--songlist-status",
         action="store_true",
@@ -267,7 +146,7 @@ Examples:
     parser.add_argument(
         "--latest-per-channel",
         action="store_true",
-        help="Download the latest N videos from each channel (use with --limit) [DEPRECATED: This is now the default behavior]",
+        help="Download the latest N videos from each channel (use with --limit)",
     )
     parser.add_argument(
         "--fuzzy-match",
@@ -277,50 +156,19 @@ Examples:
     parser.add_argument(
         "--fuzzy-threshold",
         type=int,
-        default=DEFAULT_FUZZY_THRESHOLD,
-        help=f"Fuzzy match threshold (0-100, default {DEFAULT_FUZZY_THRESHOLD})",
+        default=90,
+        help="Fuzzy match threshold (0-100, default 90)",
     )
     parser.add_argument(
         "--parallel",
         action="store_true",
-        help="Enable parallel downloads for improved speed (3-5x faster for large batches, defaults to 3 workers)",
+        help="Enable parallel downloads for improved speed",
     )
     parser.add_argument(
         "--workers",
         type=int,
         default=3,
-        help="Number of parallel download workers (default: 3, max: 10, only used with --parallel)",
-    )
-    parser.add_argument(
-        "--generate-songlist",
-        nargs="+",
-        metavar="DIRECTORY",
-        help="Generate song list from MP4 files with ID3 tags in specified directories",
-    )
-    parser.add_argument(
-        "--no-append-songlist",
-        action="store_true",
-        help="Create a new song list instead of appending when using --generate-songlist",
-    )
-    parser.add_argument(
-        "--manual",
-        action="store_true",
-        help="Download from manual videos collection (manual_videos.json in the data directory)",
-    )
-    parser.add_argument(
-        "--channel-focus",
-        type=str,
-        help="Download from a specific channel by name (e.g., 'SingKingKaraoke')",
-    )
-    parser.add_argument(
-        "--all-videos",
-        action="store_true",
-        help="Download all videos from channel (not just songlist matches), skipping existing files",
-    )
-    parser.add_argument(
-        "--dry-run",
-        action="store_true",
-        help="Build download plan and show what would be downloaded without actually downloading anything",
+        help="Number of parallel download workers (default: 3, max: 10)",
     )
     args = parser.parse_args()
 
@@ -329,11 +177,6 @@ Examples:
         print("❌ Error: --workers must be between 1 and 10")
         sys.exit(1)
 
-    # Validate channel workers argument
-    if args.channel_workers < 1 or args.channel_workers > 10:
-        print("❌ Error: --channel-workers must be between 1 and 10")
-        sys.exit(1)
-
     # Load configuration to get platform-aware yt-dlp path
     from karaoke_downloader.config_manager import load_config
     config = load_config()
@@ -344,12 +187,13 @@ Examples:
         # It's a command string, test if it works
         try:
             import subprocess
-            cmd = yt_dlp_path.split() + ["--version"]
+            from karaoke_downloader.youtube_utils import _parse_yt_dlp_command
+            cmd = _parse_yt_dlp_command(yt_dlp_path) + ["--version"]
             result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
             if result.returncode != 0:
                 raise Exception(f"Command failed: {result.stderr}")
         except Exception as e:
-            platform_name = "macOS" if sys.platform == "darwin" else "Windows"
+            platform_name = "macOS" if sys.platform == "darwin" else "Windows" if sys.platform == "win32" else "Linux"
             print(f"❌ Error: yt-dlp command failed: {yt_dlp_path}")
             print(f"Please ensure yt-dlp is properly installed for {platform_name}")
             print(f"Error: {e}")
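The right-hand side of this hunk extends the platform-name chain with Linux. As a standalone sketch of that chained conditional (the helper name is illustrative; the CLI inlines the expression):

```python
import sys

def platform_name():
    # Mirrors the chained conditional in the hunk above:
    # macOS on darwin, Windows on win32, Linux otherwise.
    if sys.platform == "darwin":
        return "macOS"
    if sys.platform == "win32":
        return "Windows"
    return "Linux"
```

Treating Linux as the fallthrough case keeps the check short, at the cost of reporting "Linux" on any other POSIX platform.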
@@ -358,7 +202,7 @@ Examples:
         # It's a file path, check if it exists
         yt_dlp_file = Path(yt_dlp_path)
         if not yt_dlp_file.exists():
-            platform_name = "macOS" if sys.platform == "darwin" else "Windows"
+            platform_name = "macOS" if sys.platform == "darwin" else "Windows" if sys.platform == "win32" else "Linux"
             binary_name = yt_dlp_file.name
             print(f"❌ Error: {binary_name} not found in downloader/ directory")
             print(f"Please ensure {binary_name} is present in the downloader/ folder for {platform_name}")
@@ -392,19 +236,9 @@ Examples:
     if args.songlist_focus:
         downloader.songlist_focus_titles = args.songlist_focus
         downloader.songlist_only = True  # Enable songlist-only mode when focusing
-        args.songlist_only = True  # Also set the args flag to ensure CLI logic works
         print(
             f"🎯 Songlist focus mode enabled for playlists: {', '.join(args.songlist_focus)}"
         )
-    if args.songlist_file:
-        downloader.songlist_file_path = args.songlist_file
-        print(f"📁 Using custom songlist file: {args.songlist_file}")
-    if args.force:
-        downloader.force_download = True
-        print("💪 Force mode enabled - will download regardless of existing files or server duplicates")
-    if args.dry_run:
-        downloader.dry_run = True
-        print("🔍 Dry run mode enabled - will show download plan without downloading")
     if args.resolution != "720p":
         downloader.config_manager.update_resolution(args.resolution)
 
@@ -418,16 +252,17 @@ Examples:
         sys.exit(0)
     # --- END NEW ---
 
-    # --- NEW: If no URL or file is provided, but --songlist-only is set, use all channels ---
-    if (args.songlist_only or args.songlist_focus) and not args.url and not args.file:
-        channel_urls = load_channels()
-        if channel_urls:
+    # --- NEW: If no URL or file is provided, but --songlist-only is set, use all channels in data/channels.txt ---
+    if args.songlist_only and not args.url and not args.file:
+        channels_file = Path("data/channels.txt")
+        if channels_file.exists():
+            args.file = str(channels_file)
             print(
-                "📋 No URL or --file provided, defaulting to all configured channels for songlist mode."
+                "📋 No URL or --file provided, defaulting to all channels in data/channels.txt for songlist-only mode."
             )
         else:
             print(
-                "❌ No URL, --file, or channel configuration found. Please provide a channel URL or create channels.json in the data directory."
+                "❌ No URL, --file, or data/channels.txt found. Please provide a channel URL or a file with channel URLs."
             )
             sys.exit(1)
     # --- END NEW ---
@@ -447,22 +282,6 @@ Examples:
         print("ℹ️ Songs will be re-checked against the server on next run.")
         sys.exit(0)
-
-    if args.generate_songlist:
-        from karaoke_downloader.songlist_generator import SongListGenerator
-
-        print("🎵 Generating song list from MP4 files with ID3 tags...")
-        generator = SongListGenerator()
-        try:
-            generator.generate_songlist_from_multiple_directories(
-                args.generate_songlist,
-                append=not args.no_append_songlist
-            )
-            print("✅ Song list generation completed successfully!")
-        except Exception as e:
-            print(f"❌ Error generating song list: {e}")
-            sys.exit(1)
-        sys.exit(0)
 
     if args.status:
         stats = downloader.tracker.get_statistics()
         print("🎤 Karaoke Downloader Status")
@@ -480,10 +299,9 @@ Examples:
         print("💾 Channel Cache Information")
         print("=" * 40)
         print(f"Total Channels: {cache_info['total_channels']}")
-        print(f"Total Cached Videos: {cache_info['total_videos']}")
-        print("\n📋 Channel Details:")
-        for channel in cache_info['channels']:
-            print(f" • {channel['channel']}: {channel['videos']} videos (updated: {channel['last_updated']})")
+        print(f"Total Cached Videos: {cache_info['total_cached_videos']}")
+        print(f"Cache Duration: {cache_info['cache_duration_hours']} hours")
+        print(f"Last Updated: {cache_info['last_updated']}")
         sys.exit(0)
     elif args.clear_cache:
         if args.clear_cache == "all":
@@ -523,243 +341,71 @@ Examples:
             if len(tracking) > 10:
                 print(f" ... and {len(tracking) - 10} more")
         sys.exit(0)
-    elif args.manual:
-        # Download from manual videos collection
-        print("🎤 Downloading from manual videos collection...")
-        success = downloader.download_channel_videos(
-            "manual://static",
-            force_refresh=args.refresh,
-            fuzzy_match=args.fuzzy_match,
-            fuzzy_threshold=args.fuzzy_threshold,
-            force_download=args.force,
-        )
-    elif args.channel_focus:
-        # Download from a specific channel by name
-        print(f"🎤 Looking up channel: {args.channel_focus}")
-        channel_url = get_channel_url_by_name(args.channel_focus)
-
-        if not channel_url:
-            print(f"❌ Channel '{args.channel_focus}' not found in configuration")
-            print("Available channels:")
-            channel_urls = load_channels()
-            for url in channel_urls:
-                if "/@" in url:
-                    channel_name = url.split("/@")[1].split("/")[0]
-                    print(f" • {channel_name}")
-            sys.exit(1)
-
-        if args.all_videos:
-            # Download ALL videos from the channel (not just songlist matches)
-            print(f"🎤 Downloading ALL videos from channel: {args.channel_focus} ({channel_url})")
-            success = downloader.download_all_channel_videos(
-                channel_url,
-                force_refresh=args.refresh,
-                force_download=args.force,
-                limit=args.limit,
-                dry_run=args.dry_run,
-            )
-        else:
-            # Download only songlist matches from the channel
-            print(f"🎤 Downloading from channel: {args.channel_focus} ({channel_url})")
-            success = downloader.download_channel_videos(
-                channel_url,
-                force_refresh=args.refresh,
-                fuzzy_match=args.fuzzy_match,
-                fuzzy_threshold=args.fuzzy_threshold,
-                force_download=args.force,
-                dry_run=args.dry_run,
-            )
     elif args.songlist_only or args.songlist_focus:
-        # Use provided file or default to channels configuration
-        channel_urls = load_channels(args.file)
-        if not channel_urls:
-            print(f"❌ No channels found in configuration")
-            sys.exit(1)
-        limit = args.limit if args.limit else None
-        success = downloader.download_songlist_across_channels(
-            channel_urls,
-            limit=args.limit,
-            force_refresh_download_plan=args.force_download_plan if hasattr(args, "force_download_plan") else False,
-            fuzzy_match=args.fuzzy_match,
-            fuzzy_threshold=args.fuzzy_threshold,
-            force_download=args.force,
-            show_pagination=args.show_pagination,
-            parallel_channels=args.parallel_channels,
-            max_channel_workers=args.channel_workers,
-            dry_run=args.dry_run,
-        )
-    elif args.latest_per_channel:
-        # Use provided file or default to channels configuration
-        channel_urls = load_channels(args.file)
-        if not channel_urls:
-            print(f"❌ No channels found in configuration")
-            sys.exit(1)
-        limit = args.limit if args.limit else DEFAULT_LATEST_PER_CHANNEL_LIMIT
-        force_refresh_download_plan = (
-            args.force_download_plan if hasattr(args, "force_download_plan") else False
-        )
-        fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
-        fuzzy_threshold = (
-            args.fuzzy_threshold
-            if hasattr(args, "fuzzy_threshold")
-            else DEFAULT_FUZZY_THRESHOLD
-        )
-        success = downloader.download_latest_per_channel(
-            channel_urls,
-            limit=limit,
-            force_refresh_download_plan=force_refresh_download_plan,
-            fuzzy_match=fuzzy_match,
-            fuzzy_threshold=fuzzy_threshold,
-            force_download=args.force,
-            dry_run=args.dry_run,
-        )
-    elif args.url:
-        success = downloader.download_channel_videos(
-            args.url, force_refresh=args.refresh, dry_run=args.dry_run
-        )
-    else:
-        # Default behavior: download from channels (equivalent to --latest-per-channel)
-        print("🎯 No specific mode specified, defaulting to download from channels")
-        channel_urls = load_channels(args.file)
-        if not channel_urls:
-            print(f"❌ No channels found in configuration")
-            print("Please provide a channel URL or create channels.json in the data directory")
-            sys.exit(1)
-        limit = args.limit if args.limit else DEFAULT_LATEST_PER_CHANNEL_LIMIT
-        force_refresh_download_plan = (
-            args.force_download_plan if hasattr(args, "force_download_plan") else False
-        )
-        fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
-        fuzzy_threshold = (
-            args.fuzzy_threshold
-            if hasattr(args, "fuzzy_threshold")
-            else DEFAULT_FUZZY_THRESHOLD
-        )
-        success = downloader.download_latest_per_channel(
-            channel_urls,
-            limit=limit,
-            force_refresh_download_plan=force_refresh_download_plan,
-            fuzzy_match=fuzzy_match,
-            fuzzy_threshold=fuzzy_threshold,
-            force_download=args.force,
-            dry_run=args.dry_run,
-        )
-
-    # Generate unmatched report if requested (additive feature)
-    if args.generate_unmatched_report:
-        from karaoke_downloader.download_planner import generate_unmatched_report, build_download_plan
-        from karaoke_downloader.songlist_manager import load_songlist
-
-        print("\n🔍 Generating unmatched songs report...")
-
-        # Load songlist based on focus mode
-        if args.songlist_focus:
-            # Load focused playlists
-            songlist_file_path = args.songlist_file if args.songlist_file else str(get_data_path_manager().get_songlist_path())
-            songlist_file = Path(songlist_file_path)
-            if not songlist_file.exists():
-                print(f"⚠️ Songlist file not found: {songlist_file_path}")
-            else:
-                try:
-                    with open(songlist_file, "r", encoding="utf-8") as f:
-                        raw_data = json.load(f)
-
-                    # Filter playlists by title
-                    focused_playlists = []
-                    for playlist in raw_data:
-                        playlist_title = playlist.get("title", "")
-                        if playlist_title in args.songlist_focus:
-                            focused_playlists.append(playlist)
-
-                    if focused_playlists:
-                        # Flatten the focused playlists into songs
-                        focused_songs = []
-                        seen = set()
-                        for playlist in focused_playlists:
-                            if "songs" in playlist:
-                                for song in playlist["songs"]:
-                                    if "artist" in song and "title" in song:
-                                        artist = song["artist"].strip()
-                                        title = song["title"].strip()
-                                        key = f"{artist.lower()}_{title.lower()}"
-                                        if key in seen:
-                                            continue
-                                        seen.add(key)
-                                        focused_songs.append(
-                                            {
-                                                "artist": artist,
-                                                "title": title,
-                                                "position": song.get("position", 0),
-                                            }
-                                        )
-
-                        songlist = focused_songs
-                    else:
-                        print(f"⚠️ No playlists found matching: {', '.join(args.songlist_focus)}")
-                        songlist = []
-
-                except (json.JSONDecodeError, FileNotFoundError) as e:
-                    print(f"⚠️ Could not load songlist for report: {e}")
-                    songlist = []
-        else:
-            # Load all songs from songlist
-            songlist_path = args.songlist_file if args.songlist_file else str(get_data_path_manager().get_songlist_path())
-            songlist = load_songlist(songlist_path)
-
-        if songlist:
-            # Load channel URLs
-            channel_file = args.file if args.file else str(get_data_path_manager().get_channels_txt_path())
-            if os.path.exists(channel_file):
-                with open(channel_file, "r", encoding='utf-8') as f:
-                    channel_urls = [
-                        line.strip()
-                        for line in f
-                        if line.strip() and not line.strip().startswith("#")
-                    ]
-
-                print(f"📋 Analyzing {len(songlist)} songs against {len(channel_urls)} channels...")
-
-                # Build download plan to get unmatched songs
-                fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
-                fuzzy_threshold = (
-                    args.fuzzy_threshold
-                    if hasattr(args, "fuzzy_threshold")
-                    else DEFAULT_FUZZY_THRESHOLD
-                )
-
-                try:
-                    download_plan, unmatched = build_download_plan(
-                        channel_urls,
-                        songlist,
-                        downloader.tracker,
-                        downloader.yt_dlp_path,
-                        fuzzy_match=fuzzy_match,
-                        fuzzy_threshold=fuzzy_threshold,
-                    )
-
-                    if unmatched:
-                        report_file = generate_unmatched_report(unmatched)
-                        print(f"\n📋 Unmatched songs report generated successfully!")
-                        print(f"📁 Report saved to: {report_file}")
-                        print(f"📊 Summary: {len(download_plan)} songs found, {len(unmatched)} songs not found")
-                        print(f"\n🔍 First 10 unmatched songs:")
-                        for i, song in enumerate(unmatched[:10], 1):
-                            print(f" {i:2d}. {song['artist']} - {song['title']}")
-                        if len(unmatched) > 10:
-                            print(f" ... and {len(unmatched) - 10} more songs")
-                    else:
-                        print(f"\n✅ All {len(songlist)} songs were found in the channels!")
-
-                except Exception as e:
-                    print(f"❌ Error generating report: {e}")
-            else:
-                print(f"❌ Channel file not found: {channel_file}")
+        # Use provided file or default to data/channels.txt
+        channel_file = args.file if args.file else "data/channels.txt"
+        if not os.path.exists(channel_file):
+            print(f"❌ Channel file not found: {channel_file}")
+            sys.exit(1)
+        with open(channel_file, "r", encoding="utf-8") as f:
+            channel_urls = [
+                line.strip()
+                for line in f
+                if line.strip() and not line.strip().startswith("#")
+            ]
+        limit = args.limit if args.limit else None
+        force_refresh_download_plan = (
+            args.force_download_plan if hasattr(args, "force_download_plan") else False
+        )
+        fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
+        fuzzy_threshold = (
+            args.fuzzy_threshold
+            if hasattr(args, "fuzzy_threshold")
+            else DEFAULT_FUZZY_THRESHOLD
+        )
+        success = downloader.download_songlist_across_channels(
+            channel_urls,
+            limit=limit,
+            force_refresh_download_plan=force_refresh_download_plan,
+            fuzzy_match=fuzzy_match,
+            fuzzy_threshold=fuzzy_threshold,
+        )
+    elif args.latest_per_channel:
+        # Use provided file or default to data/channels.txt
+        channel_file = args.file if args.file else "data/channels.txt"
+        if not os.path.exists(channel_file):
+            print(f"❌ Channel file not found: {channel_file}")
+            sys.exit(1)
+        with open(channel_file, "r", encoding="utf-8") as f:
+            channel_urls = [
+                line.strip()
+                for line in f
+                if line.strip() and not line.strip().startswith("#")
+            ]
+        limit = args.limit if args.limit else DEFAULT_LATEST_PER_CHANNEL_LIMIT
+        force_refresh_download_plan = (
+            args.force_download_plan if hasattr(args, "force_download_plan") else False
+        )
+        fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
+        fuzzy_threshold = (
+            args.fuzzy_threshold
+            if hasattr(args, "fuzzy_threshold")
+            else DEFAULT_FUZZY_THRESHOLD
+        )
+        success = downloader.download_latest_per_channel(
+            channel_urls,
+            limit=limit,
+            force_refresh_download_plan=force_refresh_download_plan,
+            fuzzy_match=fuzzy_match,
+            fuzzy_threshold=fuzzy_threshold,
+        )
+    elif args.url:
+        success = downloader.download_channel_videos(
+            args.url, force_refresh=args.refresh
|
||||||
|
)
|
||||||
else:
|
else:
|
||||||
print("❌ No songlist available for report generation")
|
parser.print_help()
|
||||||
|
sys.exit(1)
|
||||||
# Initialize success variable
|
|
||||||
success = False
|
|
||||||
|
|
||||||
downloader.tracker.force_save()
|
downloader.tracker.force_save()
|
||||||
if success:
|
if success:
|
||||||
print("\n🎤 All downloads completed successfully!")
|
print("\n🎤 All downloads completed successfully!")
|
||||||
|
|||||||
@@ -36,7 +36,6 @@ DEFAULT_CONFIG = {
     "folder_structure": {
         "downloads_dir": "downloads",
         "logs_dir": "logs",
-        "data_dir": "data",
        "tracking_file": "data/karaoke_tracking.json",
     },
     "logging": {
@@ -49,8 +48,9 @@ DEFAULT_CONFIG = {
     "auto_detect_platform": True,
     "yt_dlp_paths": {
         "windows": "downloader/yt-dlp.exe",
-        "macos": "downloader/yt-dlp_macos"
-    }
+        "macos": "downloader/yt-dlp_macos",
+        "linux": "downloader/yt-dlp",
+    },
     },
     "yt_dlp_path": "downloader/yt-dlp.exe",
 }
@@ -66,20 +66,23 @@ RESOLUTION_MAP = {


 def detect_platform() -> str:
-    """Detect the current platform and return platform name."""
+    """Detect the current platform and return the appropriate platform key."""
     system = platform.system().lower()
     if system == "windows":
         return "windows"
     elif system == "darwin":
         return "macos"
+    elif system == "linux":
+        return "linux"
     else:
-        return "windows"  # Default to Windows for other platforms
+        # Default to windows for unknown platforms
+        return "windows"


 def get_platform_yt_dlp_path(platform_paths: Dict[str, str]) -> str:
     """Get the appropriate yt-dlp path for the current platform."""
-    platform_name = detect_platform()
-    return platform_paths.get(platform_name, platform_paths.get("windows", "downloader/yt-dlp.exe"))
+    platform_key = detect_platform()
+    return platform_paths.get(platform_key, platform_paths.get("windows", "downloader/yt-dlp.exe"))


 @dataclass
@@ -136,7 +139,6 @@ class FolderStructure:

     downloads_dir: str = "downloads"
     logs_dir: str = "logs"
-    data_dir: str = "data"
     tracking_file: str = "data/karaoke_tracking.json"

@@ -167,21 +169,14 @@ class ConfigManager:
     Manages application configuration with loading, validation, and caching.
     """

-    def __init__(self, config_file: Union[str, Path] = "config/config.json", data_dir: Optional[str] = None):
+    def __init__(self, config_file: Union[str, Path] = "data/config.json"):
         """
         Initialize the configuration manager.

         Args:
             config_file: Path to the configuration file
-            data_dir: Optional custom data directory path
         """
-        # If config_file is relative and data_dir is provided, make it relative to data_dir
-        if data_dir and not Path(config_file).is_absolute():
-            self.config_file = Path(data_dir) / config_file
-        else:
-            self.config_file = Path(config_file)
+        self.config_file = Path(config_file)

-        self._data_dir = data_dir
         self._config: Optional[AppConfig] = None
         self._last_modified: Optional[datetime] = None

@@ -342,35 +337,27 @@ class ConfigManager:
 _config_manager: Optional[ConfigManager] = None


-def get_config_manager(config_file: Optional[Union[str, Path]] = None, data_dir: Optional[str] = None) -> ConfigManager:
+def get_config_manager() -> ConfigManager:
     """
     Get the global configuration manager instance.

-    Args:
-        config_file: Optional path to config file (default: "config.json" in root)
-        data_dir: Optional custom data directory path
-
     Returns:
         ConfigManager instance
     """
     global _config_manager
-    if _config_manager is None or config_file is not None or data_dir is not None:
-        if config_file is None:
-            config_file = "config/config.json"
-        _config_manager = ConfigManager(config_file, data_dir)
+    if _config_manager is None:
+        _config_manager = ConfigManager()
     return _config_manager


-def load_config(force_reload: bool = False, config_file: Optional[Union[str, Path]] = None, data_dir: Optional[str] = None) -> AppConfig:
+def load_config(force_reload: bool = False) -> AppConfig:
     """
     Load configuration using the global manager.

     Args:
         force_reload: Force reload even if file hasn't changed
-        config_file: Optional path to config file (default: "config.json" in root)
-        data_dir: Optional custom data directory path

     Returns:
         AppConfig instance
     """
-    return get_config_manager(config_file, data_dir).load_config(force_reload)
+    return get_config_manager().load_config(force_reload)
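The simplified accessor in this hunk is a plain lazy-singleton: the first call constructs the manager, every later call returns the same instance. A minimal sketch of that pattern, using stand-in names modeled on the hunk (the real `ConfigManager` does far more):

```python
from pathlib import Path
from typing import Optional


class ConfigManager:
    """Minimal stand-in for the project's ConfigManager."""

    def __init__(self, config_file: str = "data/config.json"):
        self.config_file = Path(config_file)


_config_manager: Optional[ConfigManager] = None


def get_config_manager() -> ConfigManager:
    """Create the singleton on first use; reuse it afterwards."""
    global _config_manager
    if _config_manager is None:
        _config_manager = ConfigManager()
    return _config_manager
```

Dropping the `config_file`/`data_dir` parameters is what makes the singleton safe: in the old version, passing a different path silently replaced the shared instance for every other caller.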
@@ -1,184 +0,0 @@
-"""
-Data path management utilities for the karaoke downloader.
-Provides centralized data directory path management and file path resolution.
-"""
-
-import os
-from pathlib import Path
-from typing import Optional
-
-from .config_manager import get_config_manager
-
-
-class DataPathManager:
-    """
-    Manages data directory paths and provides utilities for resolving file paths
-    relative to the configured data directory.
-    """
-
-    def __init__(self, data_dir: Optional[str] = None):
-        """
-        Initialize the data path manager.
-
-        Args:
-            data_dir: Optional custom data directory path. If None, uses config.
-        """
-        self._data_dir = data_dir
-
-        # If a custom data directory is provided, look for config.json in that directory
-        if data_dir:
-            config_file = Path(data_dir) / "config.json"
-            self._config_manager = get_config_manager(str(config_file))
-        else:
-            # Otherwise, use the default config.json in the root directory
-            self._config_manager = get_config_manager()
-
-    @property
-    def data_dir(self) -> Path:
-        """
-        Get the configured data directory path.
-
-        Returns:
-            Path to the data directory
-        """
-        if self._data_dir:
-            return Path(self._data_dir)
-
-        # Get from config
-        config = self._config_manager.get_config()
-        data_dir = getattr(config.folder_structure, 'data_dir', 'data')
-        return Path(data_dir)
-
-    def get_path(self, filename: str) -> Path:
-        """
-        Get the full path to a file in the data directory.
-
-        Args:
-            filename: Name of the file (e.g., 'config.json', 'channels.json')
-
-        Returns:
-            Full path to the file
-        """
-        return self.data_dir / filename
-
-    def get_channels_json_path(self) -> Path:
-        """Get path to channels.json file."""
-        return self.get_path('channels.json')
-
-    def get_channels_txt_path(self) -> Path:
-        """Get path to channels.txt file."""
-        return self.get_path('channels.txt')
-
-    def get_songlist_path(self) -> Path:
-        """Get path to songList.json file."""
-        return self.get_path('songList.json')
-
-    def get_songlist_tracking_path(self) -> Path:
-        """Get path to songlist_tracking.json file."""
-        return self.get_path('songlist_tracking.json')
-
-    def get_karaoke_tracking_path(self) -> Path:
-        """Get path to karaoke_tracking.json file."""
-        return self.get_path('karaoke_tracking.json')
-
-    def get_server_duplicates_tracking_path(self) -> Path:
-        """Get path to server_duplicates_tracking.json file."""
-        return self.get_path('server_duplicates_tracking.json')
-
-    def get_manual_videos_path(self) -> Path:
-        """Get path to manual_videos.json file."""
-        return self.get_path('manual_videos.json')
-
-    def get_songs_path(self) -> Path:
-        """Get path to songs.json file."""
-        return self.get_path('songs.json')
-
-    def get_channel_cache_dir(self) -> Path:
-        """Get path to channel_cache directory."""
-        return self.get_path('channel_cache')
-
-    def get_channel_cache_path(self, channel_id: str) -> Path:
-        """Get path to a specific channel cache file."""
-        return self.get_channel_cache_dir() / f"{channel_id}.json"
-
-    def get_download_plan_cache_path(self, plan_name: str, **kwargs) -> Path:
-        """Get path to download plan cache file."""
-        # Create a hash from kwargs for unique cache files
-        import hashlib
-        if kwargs:
-            kwargs_str = str(sorted(kwargs.items()))
-            hash_suffix = hashlib.md5(kwargs_str.encode()).hexdigest()[:8]
-            plan_name = f"{plan_name}_{hash_suffix}"
-        return self.get_path(f"plan_latest_per_channel_{plan_name}.json")
-
-    def get_unmatched_report_path(self, timestamp: Optional[str] = None) -> Path:
-        """Get path to unmatched songs report file."""
-        if timestamp:
-            return self.get_path(f"unmatched_songs_report_{timestamp}.json")
-        return self.get_path("unmatched_songs_report.json")
-
-    def ensure_data_dir_exists(self) -> None:
-        """Ensure the data directory exists."""
-        self.data_dir.mkdir(parents=True, exist_ok=True)
-
-    def list_data_files(self) -> list:
-        """List all files in the data directory."""
-        if not self.data_dir.exists():
-            return []
-
-        files = []
-        for file_path in self.data_dir.iterdir():
-            if file_path.is_file():
-                files.append(file_path.name)
-        return sorted(files)
-
-    def file_exists(self, filename: str) -> bool:
-        """Check if a file exists in the data directory."""
-        return self.get_path(filename).exists()
-
-
-# Global data path manager instance
-_data_path_manager: Optional[DataPathManager] = None
-
-
-def get_data_path_manager(data_dir: Optional[str] = None) -> DataPathManager:
-    """
-    Get the global data path manager instance.
-
-    Args:
-        data_dir: Optional custom data directory path
-
-    Returns:
-        DataPathManager instance
-    """
-    global _data_path_manager
-    if _data_path_manager is None or data_dir is not None:
-        _data_path_manager = DataPathManager(data_dir)
-    return _data_path_manager
-
-
-def get_data_path(filename: str, data_dir: Optional[str] = None) -> Path:
-    """
-    Get the full path to a file in the data directory.
-
-    Args:
-        filename: Name of the file
-
-    Returns:
-        Full path to the file
-    """
-    return get_data_path_manager(data_dir).get_path(filename)
-
-
-def get_data_dir(data_dir: Optional[str] = None) -> Path:
-    """
-    Get the configured data directory path.
-
-    Args:
-        data_dir: Optional custom data directory path
-
-    Returns:
-        Path to the data directory
-    """
-    return get_data_path_manager(data_dir).data_dir
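One detail worth keeping from the deleted module is how `get_download_plan_cache_path` derived a distinct cache file per parameter set: it hashes the sorted kwargs and appends a short suffix to the plan name. A standalone sketch of that naming scheme (the helper name and filename prefix are copied from the removed code; nothing else is assumed):

```python
import hashlib


def plan_cache_name(plan_name: str, **kwargs) -> str:
    """Derive a stable, per-parameter-set cache filename.

    Identical kwargs always hash to the same 8-char suffix, so repeated runs
    with the same options reuse one cache file, while different options get
    their own.
    """
    if kwargs:
        kwargs_str = str(sorted(kwargs.items()))  # sorted => order-independent
        hash_suffix = hashlib.md5(kwargs_str.encode()).hexdigest()[:8]
        plan_name = f"{plan_name}_{hash_suffix}"
    return f"plan_latest_per_channel_{plan_name}.json"
```

Sorting the items before hashing is what makes `limit=5, fuzzy=True` and `fuzzy=True, limit=5` map to the same file.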
@@ -20,12 +20,6 @@ from karaoke_downloader.youtube_utils import (
     execute_yt_dlp_command,
     show_available_formats,
 )
-from karaoke_downloader.file_utils import (
-    cleanup_temp_files,
-    get_unique_filename,
-    is_valid_mp4_file,
-    sanitize_filename,
-)


 class DownloadPipeline:
@@ -69,15 +63,9 @@ class DownloadPipeline:
             True if successful, False otherwise
         """
         try:
-            # Step 1: Prepare file path and check for existing files
-            output_path, file_exists = get_unique_filename(self.downloads_dir, channel_name, artist, title)
-            if file_exists:
-                print(f"⏭️ Skipping download - file already exists: {output_path.name}")
-                # Still add tags and track the existing file
-                if self._add_tags(output_path, artist, title, channel_name):
-                    self._track_download(output_path, artist, title, video_id, channel_name)
-                return True
+            # Step 1: Prepare file path
+            filename = sanitize_filename(artist, title)
+            output_path = self.downloads_dir / channel_name / filename

             # Step 2: Download video
             if not self._download_video(video_id, output_path, artist, title, channel_name):
@@ -226,10 +214,8 @@ class DownloadPipeline:
     ) -> bool:
         """Step 3: Add ID3 tags to the downloaded file."""
         try:
-            # Use the same artist/title as the filename for consistency
-            # Don't add "(Karaoke Version)" to the ID3 tag title
             add_id3_tags(
-                output_path, f"{artist} - {title}", channel_name
+                output_path, f"{artist} - {title} (Karaoke Version)", channel_name
             )
             print(f"🏷️ Added ID3 tags: {artist} - {title}")
             return True
@@ -297,10 +283,9 @@ class DownloadPipeline:
             video_title = video.get("title", "")

             # Extract artist and title from video title
-            from karaoke_downloader.channel_parser import ChannelParser
+            from karaoke_downloader.id3_utils import extract_artist_title

-            channel_parser = ChannelParser()
-            artist, title = channel_parser.extract_artist_title(video_title, channel_name)
+            artist, title = extract_artist_title(video_title)

             print(f" ({i}/{total}) Processing: {artist} - {title}")

@@ -3,31 +3,19 @@ Download plan building utilities.
 Handles pre-scanning channels and building download plans.
 """

-import concurrent.futures
-import hashlib
-import json
-import sys
-from datetime import datetime
-from pathlib import Path
-from typing import Any, Dict, List, Optional, Tuple
-
 from karaoke_downloader.cache_manager import (
     delete_plan_cache,
     get_download_plan_cache_file,
     load_cached_plan,
     save_plan_cache,
 )
-# Import all fuzzy matching functions
 from karaoke_downloader.fuzzy_matcher import (
     create_song_key,
-    create_video_key,
+    extract_artist_title,
     get_similarity_function,
     is_exact_match,
     is_fuzzy_match,
-    normalize_title,
 )
-from karaoke_downloader.channel_parser import ChannelParser
-from karaoke_downloader.data_path_manager import get_data_path_manager
 from karaoke_downloader.youtube_utils import get_channel_info

 # Constants
@@ -35,156 +23,6 @@ DEFAULT_FILENAME_LENGTH_LIMIT = 100
 DEFAULT_ARTIST_LENGTH_LIMIT = 30
 DEFAULT_TITLE_LENGTH_LIMIT = 60
 DEFAULT_FUZZY_THRESHOLD = 85
-DEFAULT_DISPLAY_LIMIT = 10
-
-
-def generate_unmatched_report(unmatched: List[Dict[str, Any]], report_path: str = None) -> str:
-    """
-    Generate a detailed report of unmatched songs and save it to a file.
-
-    Args:
-        unmatched: List of unmatched songs from build_download_plan
-        report_path: Optional path to save the report (default: data/unmatched_songs_report.json)
-
-    Returns:
-        Path to the saved report file
-    """
-    if report_path is None:
-        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-        report_path = str(get_data_path_manager().get_unmatched_report_path(timestamp))
-
-    report_data = {
-        "generated_at": datetime.now().isoformat(),
-        "total_unmatched": len(unmatched),
-        "unmatched_songs": []
-    }
-
-    for song in unmatched:
-        report_data["unmatched_songs"].append({
-            "artist": song["artist"],
-            "title": song["title"],
-            "position": song.get("position", 0),
-            "search_key": create_song_key(song["artist"], song["title"])
-        })
-
-    # Sort by artist, then by title for easier reading
-    report_data["unmatched_songs"].sort(key=lambda x: (x["artist"].lower(), x["title"].lower()))
-
-    # Ensure the data directory exists
-    report_file = Path(report_path)
-    report_file.parent.mkdir(parents=True, exist_ok=True)
-
-    # Save the report
-    with open(report_file, 'w', encoding='utf-8') as f:
-        json.dump(report_data, f, indent=2, ensure_ascii=False)
-
-    return str(report_file)
-
-
-def _scan_channel_for_matches(
-    channel_url,
-    channel_name,
-    channel_id,
-    song_keys,
-    song_lookup,
-    fuzzy_match,
-    fuzzy_threshold,
-    show_pagination,
-    yt_dlp_path,
-    tracker,
-):
-    """
-    Scan a single channel for matches (used in parallel processing).
-
-    Args:
-        channel_url: URL of the channel to scan
-        channel_name: Name of the channel
-        channel_id: ID of the channel
-        song_keys: Set of song keys to match against
-        song_lookup: Dictionary mapping song keys to song data
-        fuzzy_match: Whether to use fuzzy matching
-        fuzzy_threshold: Threshold for fuzzy matching
-        show_pagination: Whether to show pagination progress
-        yt_dlp_path: Path to yt-dlp executable
-        tracker: Tracking manager instance
-
-    Returns:
-        List of video matches found in this channel
-    """
-    print(f"\n🚦 Scanning channel: {channel_name} ({channel_url})")
-
-    # Get channel info if not provided
-    if not channel_name or not channel_id:
-        channel_name, channel_id = get_channel_info(channel_url)
-
-    # Fetch video list from channel
-    available_videos = tracker.get_channel_video_list(
-        channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False, show_pagination=show_pagination
-    )
-
-    print(f" 📊 Channel has {len(available_videos)} videos to scan")
-
-    video_matches = []
-
-    # Pre-process video titles for efficient matching
-    channel_parser = ChannelParser()
-    if fuzzy_match:
-        # For fuzzy matching, create normalized video keys
-        for video in available_videos:
-            v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
-            video_key = create_song_key(v_artist, v_title)
-
-            # Find best match among remaining songs
-            best_match = None
-            best_score = 0
-            for song_key in song_keys:
-                if song_key in song_lookup:  # Only check unmatched songs
-                    score = get_similarity_function()(song_key, video_key)
-                    if score >= fuzzy_threshold and score > best_score:
-                        best_score = score
-                        best_match = song_key
-
-            if best_match:
-                song = song_lookup[best_match]
-                video_matches.append(
-                    {
-                        "artist": song["artist"],
-                        "title": song["title"],
-                        "channel_name": channel_name,
-                        "channel_url": channel_url,
-                        "video_id": video["id"],
-                        "video_title": video["title"],
-                        "match_score": best_score,
-                    }
-                )
-                # Remove matched song from future consideration
-                del song_lookup[best_match]
-                song_keys.remove(best_match)
-    else:
-        # For exact matching, use direct key comparison
-        for video in available_videos:
-            v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
-            video_key = create_song_key(v_artist, v_title)
-
-            if video_key in song_keys:
-                song = song_lookup[video_key]
-                video_matches.append(
-                    {
-                        "artist": song["artist"],
-                        "title": song["title"],
-                        "channel_name": channel_name,
-                        "channel_url": channel_url,
-                        "video_id": video["id"],
-                        "video_title": video["title"],
-                        "match_score": 100,
-                    }
-                )
-                # Remove matched song from future consideration
-                del song_lookup[video_key]
-                song_keys.remove(video_key)
-
-    print(f" ✅ Found {len(video_matches)} matches in {channel_name}")
-    return video_matches
-

 def build_download_plan(
@@ -194,9 +32,6 @@ def build_download_plan(
     yt_dlp_path,
     fuzzy_match=False,
     fuzzy_threshold=DEFAULT_FUZZY_THRESHOLD,
-    show_pagination=False,
-    parallel_channels=False,
-    max_channel_workers=3,
 ):
     """
     For each song in undownloaded, scan all channels for a match.
@@ -217,120 +52,6 @@ def build_download_plan(
         song_keys.add(key)
         song_lookup[key] = song

-    if parallel_channels:
-        print(f"🚀 Running parallel channel scanning with {max_channel_workers} workers.")
-
-        # Create a thread-safe copy of song data for parallel processing
-        import threading
-        song_keys_lock = threading.Lock()
-        song_lookup_lock = threading.Lock()
-
-        def scan_channel_safe(channel_url):
-            """Thread-safe channel scanning function."""
-            print(f"\n🚦 Scanning channel: {channel_url}")
-
-            # Get channel info
-            channel_name, channel_id = get_channel_info(channel_url)
-            print(f" ✅ Channel info: {channel_name} (ID: {channel_id})")
-
-            # Fetch video list from channel
-            available_videos = tracker.get_channel_video_list(
-                channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False, show_pagination=show_pagination
-            )
-            print(f" 📊 Channel has {len(available_videos)} videos to scan")
-
-            video_matches = []
-
-            # Pre-process video titles for efficient matching
-            channel_parser = ChannelParser()
-            if fuzzy_match:
-                # For fuzzy matching, create normalized video keys
-                for video in available_videos:
-                    v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
-                    video_key = create_song_key(v_artist, v_title)
-
-                    # Find best match among remaining songs (thread-safe)
-                    best_match = None
-                    best_score = 0
-                    with song_keys_lock:
-                        available_song_keys = list(song_keys)  # Copy for iteration
-
-                    for song_key in available_song_keys:
-                        with song_lookup_lock:
-                            if song_key in song_lookup:  # Only check unmatched songs
-                                score = get_similarity_function()(song_key, video_key)
-                                if score >= fuzzy_threshold and score > best_score:
-                                    best_score = score
-                                    best_match = song_key
-
-                    if best_match:
-                        with song_lookup_lock:
-                            if best_match in song_lookup:  # Double-check it's still available
-                                song = song_lookup[best_match]
-                                video_matches.append(
-                                    {
-                                        "artist": song["artist"],
-                                        "title": song["title"],
-                                        "channel_name": channel_name,
-                                        "channel_url": channel_url,
-                                        "video_id": video["id"],
-                                        "video_title": video["title"],
-                                        "match_score": best_score,
-                                    }
-                                )
-                                # Remove matched song from future consideration
-                                del song_lookup[best_match]
-                                with song_keys_lock:
-                                    song_keys.discard(best_match)
-            else:
-                # For exact matching, use direct key comparison
-                for video in available_videos:
-                    v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
-                    video_key = create_song_key(v_artist, v_title)
-
-                    with song_lookup_lock:
-                        if video_key in song_keys and video_key in song_lookup:
-                            song = song_lookup[video_key]
-                            video_matches.append(
-                                {
-                                    "artist": song["artist"],
-                                    "title": song["title"],
-                                    "channel_name": channel_name,
-                                    "channel_url": channel_url,
-                                    "video_id": video["id"],
-                                    "video_title": video["title"],
-                                    "match_score": 100,
-                                }
-                            )
-                            # Remove matched song from future consideration
-                            del song_lookup[video_key]
-                            with song_keys_lock:
-                                song_keys.discard(video_key)
-
-            print(f" ✅ Found {len(video_matches)} matches in {channel_name}")
-            return video_matches
-
-        # Execute parallel channel scanning
-        with concurrent.futures.ThreadPoolExecutor(max_workers=max_channel_workers) as executor:
-            # Submit all channel scanning tasks
-            future_to_channel = {
-                executor.submit(scan_channel_safe, channel_url): channel_url
-                for channel_url in channel_urls
-            }
-
-            # Process results as they complete
-            for future in concurrent.futures.as_completed(future_to_channel):
-                channel_url = future_to_channel[future]
-                try:
-                    video_matches = future.result()
-                    plan.extend(video_matches)
-                    channel_name, _ = get_channel_info(channel_url)
-                    channel_match_counts[channel_name] = len(video_matches)
-                except Exception as e:
-                    print(f"⚠️ Error processing channel {channel_url}: {e}")
-                    channel_name, _ = get_channel_info(channel_url)
-                    channel_match_counts[channel_name] = 0
-    else:
-        for i, channel_url in enumerate(channel_urls, 1):
-            print(f"\n🚦 Starting channel {i}/{len(channel_urls)}: {channel_url}")
-            print(f" 🔍 Getting channel info...")
+    for i, channel_url in enumerate(channel_urls, 1):
+        print(f"\n🚦 Starting channel {i}/{len(channel_urls)}: {channel_url}")
+        print(f" 🔍 Getting channel info...")
@@ -338,7 +59,7 @@ def build_download_plan(
             print(f" ✅ Channel info: {channel_name} (ID: {channel_id})")
             print(f" 🔍 Fetching video list from channel...")
             available_videos = tracker.get_channel_video_list(
-                channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False, show_pagination=show_pagination
+                channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False
             )
             print(
                 f" 📊 Channel has {len(available_videos)} videos to scan against {len(undownloaded)} songlist songs"
@@ -347,11 +68,10 @@ def build_download_plan(
             video_matches = []  # Initialize video_matches for this channel

             # Pre-process video titles for efficient matching
-            channel_parser = ChannelParser()
             if fuzzy_match:
                 # For fuzzy matching, create normalized video keys
                 for video in available_videos:
-                    v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
+                    v_artist, v_title = extract_artist_title(video["title"])
                     video_key = create_song_key(v_artist, v_title)

                     # Find best match among remaining songs
@@ -384,7 +104,7 @@ def build_download_plan(
             else:
                 # For exact matching, use direct key comparison
                 for video in available_videos:
-                    v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
+                    v_artist, v_title = extract_artist_title(video["title"])
                     video_key = create_song_key(v_artist, v_title)

                     if video_key in song_keys:
@@ -423,13 +143,4 @@ def build_download_plan(
         f" TOTAL: {sum(channel_match_counts.values())} matches across {len(channel_match_counts)} channels."
     )
-
-    # Generate unmatched songs report if there are any
-    if unmatched:
-        try:
-            report_file = generate_unmatched_report(unmatched)
-            print(f"\n📋 Unmatched songs report saved to: {report_file}")
-            print(f"📋 Total unmatched songs: {len(unmatched)}")
-        except Exception as e:
-            print(f"⚠️ Could not generate unmatched songs report: {e}")

     return plan, unmatched
File diff suppressed because it is too large
@@ -34,6 +34,7 @@ def sanitize_filename(
     # Clean up title
     safe_title = (
         title.replace("(From ", "")
+        .replace(")", "")
         .replace(" - ", " ")
         .replace(":", "")
     )
@@ -53,18 +54,11 @@ def sanitize_filename(
     )
     safe_artist = safe_artist.strip()

-    # Create filename - handle empty artist case
-    if not safe_artist or safe_artist.strip() == "":
-        # If no artist, just use the title
-        filename = f"{safe_title}.mp4"
-    else:
-        filename = f"{safe_artist} - {safe_title}.mp4"
+    # Create filename
+    filename = f"{safe_artist} - {safe_title}.mp4"

     # Limit filename length if needed
     if len(filename) > max_length:
-        if not safe_artist or safe_artist.strip() == "":
-            filename = f"{safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
-        else:
-            filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
+        filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"

     return filename
@@ -87,14 +81,6 @@ def generate_possible_filenames(
     safe_title = sanitize_title_for_filenames(title)
     safe_artist = artist.replace("'", "").replace('"', "").strip()

-    # Handle empty artist case
-    if not safe_artist or safe_artist.strip() == "":
-        return [
-            f"{safe_title}.mp4",  # Songlist mode (no artist)
-            f"{channel_name} - {safe_title}.mp4",  # Latest-per-channel mode
-            f"{safe_title} (Karaoke Version).mp4",  # Channel videos mode (no artist)
-        ]
-    else:
-        return [
-            f"{safe_artist} - {safe_title}.mp4",  # Songlist mode
-            f"{channel_name} - {safe_title}.mp4",  # Latest-per-channel mode
+    return [
+        f"{safe_artist} - {safe_title}.mp4",  # Songlist mode
+        f"{channel_name} - {safe_title}.mp4",  # Latest-per-channel mode
@@ -126,7 +112,6 @@ def check_file_exists_with_patterns(
 ) -> Tuple[bool, Optional[Path]]:
     """
     Check if a file exists using multiple possible filename patterns.
-    Also checks for files with (2), (3), etc. suffixes that yt-dlp might create.

     Args:
         downloads_dir: Base downloads directory
@@ -145,56 +130,15 @@ def check_file_exists_with_patterns(
     # Apply length limits if needed
     safe_artist = artist.replace("'", "").replace('"', "").strip()
     safe_title = sanitize_title_for_filenames(title)
-    if not safe_artist or safe_artist.strip() == "":
-        filename = f"{safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
-    else:
-        filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
+    filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"

-    # Check for exact filename match
     file_path = channel_dir / filename
     if file_path.exists() and file_path.stat().st_size > 0:
         return True, file_path

-    # Check for files with (2), (3), etc. suffixes
-    base_name = filename.replace(".mp4", "")
-    for suffix in range(2, 10):  # Check up to (9)
-        suffixed_filename = f"{base_name} ({suffix}).mp4"
-        suffixed_path = channel_dir / suffixed_filename
-        if suffixed_path.exists() and suffixed_path.stat().st_size > 0:
-            return True, suffixed_path
-
     return False, None
-
-
-def get_unique_filename(
-    downloads_dir: Path, channel_name: str, artist: str, title: str
-) -> Tuple[Path, bool]:
-    """
-    Get a unique filename for download, checking for existing files including duplicates.
-
-    Args:
-        downloads_dir: Base downloads directory
-        channel_name: Channel name
-        artist: Song artist
-        title: Song title
-
-    Returns:
-        Tuple of (file_path, is_existing) where is_existing indicates if a file already exists
-    """
-    filename = sanitize_filename(artist, title)
-    channel_dir = downloads_dir / channel_name
-    file_path = channel_dir / filename
-
-    # Check if file already exists
-    exists, existing_path = check_file_exists_with_patterns(downloads_dir, channel_name, artist, title)
-
-    if exists and existing_path:
-        print(f"📁 File already exists: {existing_path.name}")
-        return existing_path, True
-
-    return file_path, False


 def ensure_directory_exists(directory: Path) -> None:
     """
     Ensure a directory exists, creating it if necessary.
@@ -32,72 +32,10 @@ def normalize_title(title):


 def extract_artist_title(video_title):
-    """
-    Extract artist and title from video title.
-
-    This function handles multiple common video title formats found on YouTube karaoke channels:
-
-    1. "Artist - Title" format: "38 Special - Hold On Loosely"
-    2. "Title Karaoke | Artist Karaoke Version" format: "Hold On Loosely Karaoke | 38 Special Karaoke Version"
-    3. "Title Artist KARAOKE" format: "Hold On Loosely 38 Special KARAOKE"
-
-    Args:
-        video_title (str): The YouTube video title to parse
-
-    Returns:
-        tuple: (artist, title) where artist and title are strings. If parsing fails,
-        artist will be empty string and title will be the full video title.
-
-    Examples:
-        >>> extract_artist_title("38 Special - Hold On Loosely")
-        ("38 Special", "Hold On Loosely")
-
-        >>> extract_artist_title("Hold On Loosely Karaoke | 38 Special Karaoke Version")
-        ("38 Special", "Hold On Loosely")
-
-        >>> extract_artist_title("Unknown Format Video Title")
-        ("", "Unknown Format Video Title")
-    """
-    # Handle "Artist - Title" format
+    """Extract artist and title from video title."""
     if " - " in video_title:
         parts = video_title.split(" - ", 1)
         return parts[0].strip(), parts[1].strip()

-    # Handle "Title Karaoke | Artist Karaoke Version" format
-    if " | " in video_title and "karaoke" in video_title.lower():
-        parts = video_title.split(" | ", 1)
-        title_part = parts[0].strip()
-        artist_part = parts[1].strip()
-
-        # Clean up the parts
-        title = title_part.replace("Karaoke", "").strip()
-        artist = artist_part.replace("Karaoke Version", "").strip()
-
-        return artist, title
-
-    # Handle "Title Artist KARAOKE" format
-    if "karaoke" in video_title.lower():
-        # Try to find the artist by looking for common patterns
-        title_lower = video_title.lower()
-
-        # Look for patterns like "Title Artist KARAOKE"
-        # This is a simplified approach - we'll need to improve this
-        words = video_title.split()
-        if len(words) >= 3:
-            # Assume the last word before "KARAOKE" is part of the artist
-            for i, word in enumerate(words):
-                if "karaoke" in word.lower():
-                    if i >= 2:
-                        # Everything before the last word before KARAOKE is title
-                        # Everything after is artist
-                        title = " ".join(words[:i-1])
-                        artist = " ".join(words[i-1:])
-                        return artist, title
-
-        # If we can't parse it, return empty artist and full title
-        return "", video_title
-
-    # Default: return empty artist and full title
     return "", video_title
@@ -7,33 +7,17 @@ except ImportError:
     MUTAGEN_AVAILABLE = False


-def clean_channel_name(channel_name: str) -> str:
-    """
-    Clean channel name for ID3 tagging by removing @ symbol and ensuring it's alpha-only.
-
-    Args:
-        channel_name: Raw channel name (may contain @ symbol)
-
-    Returns:
-        Cleaned channel name suitable for ID3 tags
-    """
-    # Remove @ symbol if present
-    if channel_name.startswith('@'):
-        channel_name = channel_name[1:]
-
-    # Remove any non-alphanumeric characters and convert to single word
-    # Keep only letters, numbers, and spaces, then take the first word
-    cleaned = re.sub(r'[^a-zA-Z0-9\s]', '', channel_name)
-    words = cleaned.split()
-    if words:
-        return words[0]  # Return only the first word
-
-    return "Unknown"
-
-
-# Import the enhanced extract_artist_title function from fuzzy_matcher.py
-# This ensures consistent parsing across all modules and supports multiple video title formats
-from karaoke_downloader.fuzzy_matcher import extract_artist_title
+def extract_artist_title(video_title):
+    title = (
+        video_title.replace("(Karaoke Version)", "").replace("(Karaoke)", "").strip()
+    )
+    if " - " in title:
+        parts = title.split(" - ", 1)
+        if len(parts) == 2:
+            artist = parts[0].strip()
+            song_title = parts[1].strip()
+            return artist, song_title
+    return "Unknown Artist", title


 def add_id3_tags(file_path, video_title, channel_name):
@@ -42,13 +26,12 @@ def add_id3_tags(file_path, video_title, channel_name):
         return
     try:
         artist, title = extract_artist_title(video_title)
-        clean_channel = clean_channel_name(channel_name)
         mp4 = MP4(str(file_path))
         mp4["\xa9nam"] = title
         mp4["\xa9ART"] = artist
-        mp4["\xa9alb"] = clean_channel  # Use clean channel name only, no suffix
+        mp4["\xa9alb"] = f"{channel_name} Karaoke"
         mp4["\xa9gen"] = "Karaoke"
         mp4.save()
-        print(f"📝 Added ID3 tags: Artist='{artist}', Title='{title}', Album='{clean_channel}'")
+        print(f"📝 Added ID3 tags: Artist='{artist}', Title='{title}'")
     except Exception as e:
         print(f"⚠️ Could not add ID3 tags: {e}")
@@ -1,83 +0,0 @@
-"""
-Manual video manager for handling static video collections.
-"""
-
-import json
-from pathlib import Path
-from typing import Dict, List, Optional, Any
-
-from karaoke_downloader.data_path_manager import get_data_path_manager
-
-
-def load_manual_videos(manual_file: str = None) -> List[Dict[str, Any]]:
-    if manual_file is None:
-        manual_file = str(get_data_path_manager().get_manual_videos_path())
-    """
-    Load manual videos from the JSON file.
-
-    Args:
-        manual_file: Path to manual videos JSON file
-
-    Returns:
-        List of video dictionaries
-    """
-    manual_path = Path(manual_file)
-
-    if not manual_path.exists():
-        print(f"⚠️ Manual videos file not found: {manual_file}")
-        return []
-
-    try:
-        with open(manual_path, 'r', encoding='utf-8') as f:
-            data = json.load(f)
-
-        videos = data.get("videos", [])
-        print(f"📋 Loaded {len(videos)} manual videos from {manual_file}")
-        return videos
-
-    except Exception as e:
-        print(f"❌ Error loading manual videos: {e}")
-        return []
-
-
-def get_manual_videos_for_channel(channel_name: str, manual_file: str = None) -> List[Dict[str, Any]]:
-    if manual_file is None:
-        manual_file = str(get_data_path_manager().get_manual_videos_path())
-    """
-    Get manual videos for a specific channel.
-
-    Args:
-        channel_name: Channel name (should be "@ManualVideos")
-        manual_file: Path to manual videos JSON file
-
-    Returns:
-        List of video dictionaries
-    """
-    if channel_name != "@ManualVideos":
-        return []
-
-    return load_manual_videos(manual_file)
-
-
-def is_manual_channel(channel_url: str) -> bool:
-    """
-    Check if a channel URL is a manual channel.
-
-    Args:
-        channel_url: Channel URL
-
-    Returns:
-        True if it's a manual channel
-    """
-    return channel_url == "manual://static"
-
-
-def get_manual_channel_info(channel_url: str) -> tuple[str, str]:
-    """
-    Get channel info for manual channels.
-
-    Args:
-        channel_url: Channel URL
-
-    Returns:
-        Tuple of (channel_name, channel_id)
-    """
-    if channel_url == "manual://static":
-        return "@ManualVideos", "manual"
-    return None, None
@@ -56,6 +56,14 @@ def update_resolution(resolution):
         "include_console": True,
         "include_file": True,
     },
+    "platform_settings": {
+        "auto_detect_platform": True,
+        "yt_dlp_paths": {
+            "windows": "downloader/yt-dlp.exe",
+            "macos": "downloader/yt-dlp_macos",
+            "linux": "downloader/yt-dlp",
+        },
+    },
     "yt_dlp_path": "downloader/yt-dlp.exe",
 }

@@ -7,40 +7,28 @@ import json
 from datetime import datetime
 from pathlib import Path

-from karaoke_downloader.data_path_manager import get_data_path_manager
-

-def load_server_songs(songs_path=None):
-    if songs_path is None:
-        songs_path = str(get_data_path_manager().get_songs_path())
-    """Load the list of songs already available on the server with format information."""
+def load_server_songs(songs_path="data/songs.json"):
+    """Load the list of songs already available on the server."""
     songs_file = Path(songs_path)
     if not songs_file.exists():
         print(f"⚠️ Server songs file not found: {songs_path}")
-        return {}
+        return set()
     try:
         with open(songs_file, "r", encoding="utf-8") as f:
             data = json.load(f)
-        server_songs = {}
+        server_songs = set()
         for song in data:
-            if "artist" in song and "title" in song and "path" in song:
+            if "artist" in song and "title" in song:
                 artist = song["artist"].strip()
                 title = song["title"].strip()
-                path = song["path"].strip()
                 key = f"{artist.lower()}_{normalize_title(title)}"
-                server_songs[key] = {
-                    "artist": artist,
-                    "title": title,
-                    "path": path,
-                    "is_mp3": path.lower().endswith('.mp3'),
-                    "is_cdg": 'cdg' in path.lower(),
-                    "is_mp4": path.lower().endswith('.mp4')
-                }
+                server_songs.add(key)
         print(f"📋 Loaded {len(server_songs)} songs from server (songs.json)")
         return server_songs
     except (json.JSONDecodeError, FileNotFoundError) as e:
         print(f"⚠️ Could not load server songs: {e}")
-        return {}
+        return set()


 def is_song_on_server(server_songs, artist, title):
@@ -49,24 +37,9 @@ def is_song_on_server(server_songs, artist, title):
     return key in server_songs


-def should_skip_server_song(server_songs, artist, title):
-    """Check if a song should be skipped because it's already available as MP4 on server.
-    Returns True if the song should be skipped (MP4 format), False if it should be downloaded (MP3/CDG format)."""
-    key = f"{artist.lower()}_{normalize_title(title)}"
-    if key not in server_songs:
-        return False  # Not on server, so don't skip
-
-    song_info = server_songs[key]
-    # Skip if it's an MP4 file (video format)
-    # Don't skip if it's MP3 or in CDG folder (different format)
-    return song_info.get("is_mp4", False) and not song_info.get("is_cdg", False)
-
-
 def load_server_duplicates_tracking(
-    tracking_path=None,
+    tracking_path="data/server_duplicates_tracking.json",
 ):
-    if tracking_path is None:
-        tracking_path = str(get_data_path_manager().get_server_duplicates_tracking_path())
     """Load the tracking of songs found to be duplicates on the server."""
     tracking_file = Path(tracking_path)
     if not tracking_file.exists():
@@ -80,10 +53,8 @@ def load_server_duplicates_tracking(


 def save_server_duplicates_tracking(
-    tracking, tracking_path=None
+    tracking, tracking_path="data/server_duplicates_tracking.json"
 ):
-    if tracking_path is None:
-        tracking_path = str(get_data_path_manager().get_server_duplicates_tracking_path())
     """Save the tracking of songs found to be duplicates on the server."""
     try:
         with open(tracking_path, "w", encoding="utf-8") as f:
@@ -115,9 +86,8 @@ def mark_song_as_server_duplicate(tracking, artist, title, video_title, channel_
 def check_and_mark_server_duplicate(
     server_songs, server_duplicates_tracking, artist, title, video_title, channel_name
 ):
-    """Check if a song should be skipped because it's already available as MP4 on server and mark it as duplicate if so.
-    Returns True if it should be skipped (MP4 format), False if it should be downloaded (MP3/CDG format)."""
-    if should_skip_server_song(server_songs, artist, title):
+    """Check if a song is on server and mark it as duplicate if so. Returns True if it's a duplicate."""
+    if is_song_on_server(server_songs, artist, title):
         if not is_song_marked_as_server_duplicate(
             server_duplicates_tracking, artist, title
         ):
@@ -35,7 +35,6 @@ class SongValidator:
         video_title: Optional[str] = None,
         server_songs: Optional[Dict[str, Any]] = None,
         server_duplicates_tracking: Optional[Dict[str, Any]] = None,
-        force_download: bool = False,
     ) -> Tuple[bool, Optional[str], int]:
         """
         Check if a song should be skipped based on multiple criteria.
@@ -54,15 +53,10 @@ class SongValidator:
             video_title: YouTube video title (optional)
             server_songs: Server songs data (optional)
             server_duplicates_tracking: Server duplicates tracking (optional)
-            force_download: If True, bypass all validation checks and force download

         Returns:
             Tuple of (should_skip, reason, total_filtered)
         """
-        # If force download is enabled, skip all validation checks
-        if force_download:
-            return False, None, 0
-
         total_filtered = 0

         # Check 1: Already downloaded by this system
@@ -1,265 +0,0 @@
-import json
-import os
-from pathlib import Path
-from typing import List, Dict, Any, Optional
-from mutagen.mp4 import MP4
-
-from karaoke_downloader.data_path_manager import get_data_path_manager
-
-
-class SongListGenerator:
-    """Utility class for generating song lists from MP4 files with ID3 tags."""
-
-    def __init__(self, songlist_path: str = None):
-        if songlist_path is None:
-            songlist_path = str(get_data_path_manager().get_songlist_path())
-        self.songlist_path = Path(songlist_path)
-        self.songlist_path.parent.mkdir(parents=True, exist_ok=True)
-
-    def read_existing_songlist(self) -> List[Dict[str, Any]]:
-        """Read existing song list from JSON file."""
-        if self.songlist_path.exists():
-            try:
-                with open(self.songlist_path, 'r', encoding='utf-8') as f:
-                    return json.load(f)
-            except (json.JSONDecodeError, IOError) as e:
-                print(f"⚠️ Warning: Could not read existing songlist: {e}")
-                return []
-        return []
-
-    def save_songlist(self, songlist: List[Dict[str, Any]]) -> None:
-        """Save song list to JSON file."""
-        try:
-            with open(self.songlist_path, 'w', encoding='utf-8') as f:
-                json.dump(songlist, f, indent=2, ensure_ascii=False)
-            print(f"✅ Song list saved to {self.songlist_path}")
-        except IOError as e:
-            print(f"❌ Error saving song list: {e}")
-            raise
-
-    def extract_id3_tags(self, mp4_path: Path) -> Optional[Dict[str, str]]:
-        """Extract ID3 tags from MP4 file."""
-        try:
-            mp4 = MP4(str(mp4_path))
-
-            # Extract artist and title from ID3 tags
-            artist = mp4.get("\xa9ART", ["Unknown Artist"])[0] if "\xa9ART" in mp4 else "Unknown Artist"
-            title = mp4.get("\xa9nam", ["Unknown Title"])[0] if "\xa9nam" in mp4 else "Unknown Title"
-
-            return {
-                "artist": artist,
-                "title": title
-            }
-        except Exception as e:
-            print(f"⚠️ Warning: Could not extract ID3 tags from {mp4_path.name}: {e}")
-            return None
-
-    def scan_directory_for_mp4_files(self, directory_path: str) -> List[Path]:
-        """Scan directory for MP4 files."""
-        directory = Path(directory_path)
-        if not directory.exists():
-            raise FileNotFoundError(f"Directory not found: {directory_path}")
-
-        if not directory.is_dir():
-            raise ValueError(f"Path is not a directory: {directory_path}")
-
-        mp4_files = list(directory.glob("*.mp4"))
-        if not mp4_files:
-            print(f"⚠️ No MP4 files found in {directory_path}")
-            return []
-
-        print(f"📁 Found {len(mp4_files)} MP4 files in {directory.name}")
-        return sorted(mp4_files)
-
-    def generate_songlist_from_directory(self, directory_path: str, append: bool = True) -> Dict[str, Any]:
-        """Generate a song list from MP4 files in a directory."""
-        directory = Path(directory_path)
-        directory_name = directory.name
-
-        # Scan for MP4 files
-        mp4_files = self.scan_directory_for_mp4_files(directory_path)
-        if not mp4_files:
-            return {}
-
-        # Extract ID3 tags and create songs list
-        songs = []
-        for index, mp4_file in enumerate(mp4_files, start=1):
-            id3_tags = self.extract_id3_tags(mp4_file)
-            if id3_tags:
-                song = {
-                    "position": index,
-                    "title": id3_tags["title"],
-                    "artist": id3_tags["artist"]
-                }
-                songs.append(song)
-                print(f" {index:3d}. {id3_tags['artist']} - {id3_tags['title']}")
-
-        if not songs:
-            print("❌ No valid ID3 tags found in any MP4 files")
-            return {}
-
-        # Create the song list entry
-        songlist_entry = {
-            "title": directory_name,
-            "songs": songs
-        }
-
-        # Handle appending to existing song list
-        if append:
-            existing_songlist = self.read_existing_songlist()
-
-            # Check if a playlist with this title already exists
-            existing_index = None
-            for i, entry in enumerate(existing_songlist):
-                if entry.get("title") == directory_name:
-                    existing_index = i
-                    break
-
-            if existing_index is not None:
-                # Replace existing entry
-                print(f"🔄 Replacing existing playlist: {directory_name}")
-                existing_songlist[existing_index] = songlist_entry
-            else:
-                # Add new entry to the beginning of the list
-                print(f"➕ Adding new playlist: {directory_name}")
-                existing_songlist.insert(0, songlist_entry)
-
-            self.save_songlist(existing_songlist)
-        else:
-            # Create new song list with just this entry
-            print(f"📝 Creating new song list with playlist: {directory_name}")
-            self.save_songlist([songlist_entry])
-
-        return songlist_entry
-
-    def generate_songlist_from_multiple_directories(self, directory_paths: List[str], append: bool = True) -> List[Dict[str, Any]]:
-        """Generate song lists from multiple directories."""
-        results = []
-        errors = []
-
-        # Read existing song list once at the beginning
-        existing_songlist = self.read_existing_songlist() if append else []
-
-        for directory_path in directory_paths:
-            try:
-                print(f"\n📂 Processing directory: {directory_path}")
-                directory = Path(directory_path)
-                directory_name = directory.name
-
-                # Scan for MP4 files
-                mp4_files = self.scan_directory_for_mp4_files(directory_path)
-                if not mp4_files:
-                    continue
-
-                # Extract ID3 tags and create songs list
-                songs = []
-                for index, mp4_file in enumerate(mp4_files, start=1):
-                    id3_tags = self.extract_id3_tags(mp4_file)
-                    if id3_tags:
-                        song = {
-                            "position": index,
-                            "title": id3_tags["title"],
-                            "artist": id3_tags["artist"]
-                        }
-                        songs.append(song)
-                        print(f" {index:3d}. {id3_tags['artist']} - {id3_tags['title']}")
-
-                if not songs:
-                    print("❌ No valid ID3 tags found in any MP4 files")
-                    continue
-
-                # Create the song list entry
-                songlist_entry = {
-                    "title": directory_name,
-                    "songs": songs
-                }
-
-                # Check if a playlist with this title already exists
-                existing_index = None
-                for i, entry in enumerate(existing_songlist):
-                    if entry.get("title") == directory_name:
-                        existing_index = i
-                        break
-
-                if existing_index is not None:
-                    # Replace existing entry
-                    print(f"🔄 Replacing existing playlist: {directory_name}")
-                    existing_songlist[existing_index] = songlist_entry
-                else:
-                    # Add new entry to the beginning of the list
-                    print(f"➕ Adding new playlist: {directory_name}")
-                    existing_songlist.insert(0, songlist_entry)
-
-                results.append(songlist_entry)
-
-            except Exception as e:
-                error_msg = f"Error processing {directory_path}: {e}"
-                print(f"❌ {error_msg}")
errors.append(error_msg)
|
|
||||||
|
|
||||||
# Save the final song list
|
|
||||||
if results:
|
|
||||||
if append:
|
|
||||||
# Save the updated existing song list
|
|
||||||
self.save_songlist(existing_songlist)
|
|
||||||
else:
|
|
||||||
# Create new song list with just the results
|
|
||||||
self.save_songlist(results)
|
|
||||||
|
|
||||||
# If there were any errors, raise an exception
|
|
||||||
if errors:
|
|
||||||
raise Exception(f"Failed to process {len(errors)} directories: {'; '.join(errors)}")
|
|
||||||
|
|
||||||
return results
|
|
||||||
|
|
||||||
|
|
||||||
def main():
|
|
||||||
"""CLI entry point for song list generation."""
|
|
||||||
import argparse
|
|
||||||
import sys
|
|
||||||
|
|
||||||
parser = argparse.ArgumentParser(
|
|
||||||
description="Generate song lists from MP4 files with ID3 tags",
|
|
||||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
|
||||||
epilog="""
|
|
||||||
Examples:
|
|
||||||
python -m karaoke_downloader.songlist_generator /path/to/mp4/directory
|
|
||||||
python -m karaoke_downloader.songlist_generator /path/to/dir1 /path/to/dir2 --no-append
|
|
||||||
python -m karaoke_downloader.songlist_generator /path/to/dir --songlist-path custom_songlist.json
|
|
||||||
"""
|
|
||||||
)
|
|
||||||
|
|
||||||
parser.add_argument(
|
|
||||||
"directories",
|
|
||||||
nargs="+",
|
|
||||||
help="Directory paths containing MP4 files with ID3 tags"
|
|
||||||
)
|
|
||||||
|
|
||||||
parser.add_argument(
|
|
||||||
"--no-append",
|
|
||||||
action="store_true",
|
|
||||||
help="Create a new song list instead of appending to existing one"
|
|
||||||
)
|
|
||||||
|
|
||||||
parser.add_argument(
|
|
||||||
"--songlist-path",
|
|
||||||
default=None,
|
|
||||||
help="Path to the song list JSON file (default: songList.json in the data directory)"
|
|
||||||
)
|
|
||||||
|
|
||||||
args = parser.parse_args()
|
|
||||||
|
|
||||||
try:
|
|
||||||
generator = SongListGenerator(args.songlist_path)
|
|
||||||
generator.generate_songlist_from_multiple_directories(
|
|
||||||
args.directories,
|
|
||||||
append=not args.no_append
|
|
||||||
)
|
|
||||||
print("\n✅ Song list generation completed successfully!")
|
|
||||||
except Exception as e:
|
|
||||||
print(f"\n❌ Error: {e}")
|
|
||||||
sys.exit(1)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
main()
|
|
||||||
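Taken together, the generator writes songList.json as a list of playlist entries, each `{title, songs}` with 1-based positions. The helper below is an illustrative sketch only (its name `build_songlist_entry` and the sample tracks are made up; the real generator derives the data from ID3 tags via mutagen), reproducing just that output shape:

```python
import json


def build_songlist_entry(directory_name, tracks):
    """Build one playlist entry in the shape the generator writes.

    `tracks` is a list of (artist, title) tuples; positions are 1-based,
    matching enumerate(mp4_files, start=1) in the generator.
    """
    return {
        "title": directory_name,
        "songs": [
            {"position": i, "title": title, "artist": artist}
            for i, (artist, title) in enumerate(tracks, start=1)
        ],
    }


entry = build_songlist_entry("80s Night", [("a-ha", "Take On Me"), ("Toto", "Africa")])
print(json.dumps(entry, indent=2, ensure_ascii=False))
```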
@@ -7,7 +7,6 @@ import json
 from datetime import datetime
 from pathlib import Path
 
-from karaoke_downloader.data_path_manager import get_data_path_manager
 from karaoke_downloader.server_manager import (
     check_and_mark_server_duplicate,
     is_song_marked_as_server_duplicate,
@@ -17,9 +16,7 @@ from karaoke_downloader.server_manager import (
 )
 
 
-def load_songlist(songlist_path=None):
-    if songlist_path is None:
-        songlist_path = str(get_data_path_manager().get_songlist_path())
+def load_songlist(songlist_path="data/songList.json"):
     songlist_file = Path(songlist_path)
     if not songlist_file.exists():
         print(f"⚠️ Songlist file not found: {songlist_path}")
@@ -58,9 +55,7 @@ def normalize_title(title):
     return " ".join(normalized.split()).lower()
 
 
-def load_songlist_tracking(tracking_path=None):
-    if tracking_path is None:
-        tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
+def load_songlist_tracking(tracking_path="data/songlist_tracking.json"):
     tracking_file = Path(tracking_path)
     if not tracking_file.exists():
         return {}
@@ -72,9 +67,7 @@ def load_songlist_tracking(tracking_path=None):
     return {}
 
 
-def save_songlist_tracking(tracking, tracking_path=None):
-    if tracking_path is None:
-        tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
+def save_songlist_tracking(tracking, tracking_path="data/songlist_tracking.json"):
     try:
         with open(tracking_path, "w", encoding="utf-8") as f:
             json.dump(tracking, f, indent=2, ensure_ascii=False)
@@ -1,12 +1,10 @@
-import json
-import os
-import re
-from datetime import datetime, timedelta
+import threading
 from enum import Enum
-from pathlib import Path
-from typing import Any, Dict, List, Optional, Tuple
-
-from karaoke_downloader.data_path_manager import get_data_path_manager
+import json
+from datetime import datetime
+from pathlib import Path
 
 
 class SongStatus(str, Enum):
     NOT_DOWNLOADED = "NOT_DOWNLOADED"
@@ -27,133 +25,46 @@ class FormatType(str, Enum):
 class TrackingManager:
     def __init__(
         self,
-        tracking_file=None,
-        cache_dir=None,
+        tracking_file="data/karaoke_tracking.json",
+        cache_file="data/channel_cache.json",
     ):
-        if tracking_file is None:
-            tracking_file = str(get_data_path_manager().get_karaoke_tracking_path())
-        if cache_dir is None:
-            cache_dir = str(get_data_path_manager().get_channel_cache_dir())
-
         self.tracking_file = Path(tracking_file)
-        self.cache_dir = Path(cache_dir)
-        # Ensure cache directory exists
-        self.cache_dir.mkdir(parents=True, exist_ok=True)
-        self.data = self._load()
-        print(f"📊 Tracking manager initialized with {len(self.data.get('songs', {}))} tracked songs")
+        self.cache_file = Path(cache_file)
+        self.data = {"playlists": {}, "songs": {}}
+        self.cache = {}
+        self._lock = threading.Lock()
+        self._load()
+        self._load_cache()
 
     def _load(self):
-        """Load tracking data from JSON file."""
         if self.tracking_file.exists():
             try:
                 with open(self.tracking_file, "r", encoding="utf-8") as f:
-                    return json.load(f)
-            except json.JSONDecodeError:
-                print(f"⚠️ Corrupted tracking file, creating new one")
-
-        return {"songs": {}, "playlists": {}, "last_updated": datetime.now().isoformat()}
+                    self.data = json.load(f)
+            except Exception:
+                self.data = {"playlists": {}, "songs": {}}
 
     def _save(self):
-        """Save tracking data to JSON file."""
-        self.data["last_updated"] = datetime.now().isoformat()
-        self.tracking_file.parent.mkdir(parents=True, exist_ok=True)
-        with open(self.tracking_file, "w", encoding="utf-8") as f:
-            json.dump(self.data, f, indent=2, ensure_ascii=False)
+        with self._lock:
+            with open(self.tracking_file, "w", encoding="utf-8") as f:
+                json.dump(self.data, f, indent=2, ensure_ascii=False)
 
     def force_save(self):
-        """Force save the tracking data."""
         self._save()
 
-    def _get_channel_cache_file(self, channel_id: str) -> Path:
-        """Get the cache file path for a specific channel."""
-        # Sanitize channel ID for filename
-        safe_channel_id = re.sub(r'[<>:"/\\|?*]', '_', channel_id)
-        return self.cache_dir / f"{safe_channel_id}.json"
-
-    def _load_channel_cache(self, channel_id: str) -> List[Dict[str, str]]:
-        """Load cache for a specific channel."""
-        cache_file = self._get_channel_cache_file(channel_id)
-        if cache_file.exists():
-            try:
-                with open(cache_file, 'r', encoding='utf-8') as f:
-                    data = json.load(f)
-                return data.get('videos', [])
-            except (json.JSONDecodeError, KeyError):
-                print(f"  ⚠️ Corrupted cache file for {channel_id}, will recreate")
-                return []
-        return []
-
-    def _save_channel_cache(self, channel_id: str, videos: List[Dict[str, str]]):
-        """Save cache for a specific channel."""
-        cache_file = self._get_channel_cache_file(channel_id)
-        data = {
-            'channel_id': channel_id,
-            'videos': videos,
-            'last_updated': datetime.now().isoformat(),
-            'video_count': len(videos)
-        }
-        with open(cache_file, 'w', encoding='utf-8') as f:
-            json.dump(data, f, indent=2, ensure_ascii=False)
-
-    def _clear_channel_cache(self, channel_id: str):
-        """Clear cache for a specific channel."""
-        cache_file = self._get_channel_cache_file(channel_id)
-        if cache_file.exists():
-            cache_file.unlink()
-            print(f"  🗑️ Cleared cache file: {cache_file.name}")
-
-    def get_cache_info(self):
-        """Get information about all channel cache files."""
-        cache_files = list(self.cache_dir.glob("*.json"))
-        total_videos = 0
-        cache_info = []
-
-        for cache_file in cache_files:
-            try:
-                with open(cache_file, 'r', encoding='utf-8') as f:
-                    data = json.load(f)
-                video_count = len(data.get('videos', []))
-                total_videos += video_count
-                last_updated = data.get('last_updated', 'Unknown')
-                cache_info.append({
-                    'channel': data.get('channel_id', cache_file.stem),
-                    'videos': video_count,
-                    'last_updated': last_updated,
-                    'file': cache_file.name
-                })
-            except Exception as e:
-                print(f"⚠️ Error reading cache file {cache_file.name}: {e}")
-
-        return {
-            'total_channels': len(cache_files),
-            'total_videos': total_videos,
-            'channels': cache_info
-        }
-
-    def clear_channel_cache(self, channel_id=None):
-        """Clear cache for a specific channel or all channels."""
-        if channel_id:
-            self._clear_channel_cache(channel_id)
-            print(f"🗑️ Cleared cache for channel: {channel_id}")
-        else:
-            # Clear all cache files
-            cache_files = list(self.cache_dir.glob("*.json"))
-            for cache_file in cache_files:
-                cache_file.unlink()
-            print(f"🗑️ Cleared all {len(cache_files)} channel cache files")
-
-    def set_cache_duration(self, hours):
-        """Placeholder for cache duration logic"""
-        pass
-
-    def export_playlist_report(self, playlist_id):
-        """Export a report for a specific playlist."""
-        pass
+    def _load_cache(self):
+        if self.cache_file.exists():
+            try:
+                with open(self.cache_file, "r", encoding="utf-8") as f:
+                    self.cache = json.load(f)
+            except Exception:
+                self.cache = {}
+
+    def save_cache(self):
+        with open(self.cache_file, "w", encoding="utf-8") as f:
+            json.dump(self.cache, f, indent=2, ensure_ascii=False)
 
     def get_statistics(self):
-        """Get statistics about tracked songs."""
         total_songs = len(self.data["songs"])
         downloaded_songs = sum(
             1
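For reference, the per-channel cache files written on the removed side (`_save_channel_cache`) follow a small JSON schema: `channel_id`, `videos`, `last_updated`, and `video_count`. The sketch below reproduces that layout with stand-in data (the function name, channel ID, and track are invented for illustration; only the schema and filename sanitization mirror the diff):

```python
import json
import re
import tempfile
from datetime import datetime
from pathlib import Path


def save_channel_cache(cache_dir: Path, channel_id: str, videos: list) -> Path:
    # Mirrors _save_channel_cache: sanitize the channel ID for use as a
    # filename, then persist the video list plus bookkeeping metadata.
    safe_channel_id = re.sub(r'[<>:"/\\|?*]', '_', channel_id)
    cache_file = cache_dir / f"{safe_channel_id}.json"
    data = {
        "channel_id": channel_id,
        "videos": videos,
        "last_updated": datetime.now().isoformat(),
        "video_count": len(videos),
    }
    cache_file.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    return cache_file


cache_dir = Path(tempfile.mkdtemp())
path = save_channel_cache(cache_dir, "UC/example?id", [{"title": "Song A", "id": "dQw4w9WgXcQ"}])
print(path.name)  # UC_example_id.json
```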
@@ -191,13 +102,11 @@ class TrackingManager:
         }
 
     def get_playlist_songs(self, playlist_id):
-        """Get songs for a specific playlist."""
         return [
             s for s in self.data["songs"].values() if s["playlist_id"] == playlist_id
         ]
 
     def get_failed_songs(self, playlist_id=None):
-        """Get failed songs, optionally filtered by playlist."""
         if playlist_id:
             return [
                 s
@@ -209,7 +118,6 @@
             ]
 
     def get_partial_downloads(self, playlist_id=None):
-        """Get partial downloads, optionally filtered by playlist."""
         if playlist_id:
             return [
                 s
@@ -221,7 +129,7 @@
             ]
 
     def cleanup_orphaned_files(self, downloads_dir):
-        """Remove tracking entries for files that no longer exist."""
+        # Remove tracking entries for files that no longer exist
        orphaned = []
        for song_id, song in list(self.data["songs"].items()):
            file_path = song.get("file_path")
@@ -231,17 +139,51 @@
         self.force_save()
         return orphaned
 
+    def get_cache_info(self):
+        total_channels = len(self.cache)
+        total_cached_videos = sum(len(v) for v in self.cache.values())
+        cache_duration_hours = 24  # default
+        last_updated = None
+        return {
+            "total_channels": total_channels,
+            "total_cached_videos": total_cached_videos,
+            "cache_duration_hours": cache_duration_hours,
+            "last_updated": last_updated,
+        }
+
+    def clear_channel_cache(self, channel_id=None):
+        if channel_id is None or channel_id == "all":
+            self.cache = {}
+        else:
+            self.cache.pop(channel_id, None)
+        self.save_cache()
+
+    def set_cache_duration(self, hours):
+        # Placeholder for cache duration logic
+        pass
+
+    def export_playlist_report(self, playlist_id):
+        playlist = self.data["playlists"].get(playlist_id)
+        if not playlist:
+            return f"Playlist '{playlist_id}' not found."
+        songs = self.get_playlist_songs(playlist_id)
+        report = {"playlist": playlist, "songs": songs}
+        return json.dumps(report, indent=2, ensure_ascii=False)
+
     def is_song_downloaded(self, artist, title, channel_name=None, video_id=None):
         """
-        Check if a song has already been downloaded.
-        Returns True if the song exists in tracking with DOWNLOADED status.
+        Check if a song has already been downloaded by this system.
+        Returns True if the song exists in tracking with DOWNLOADED or CONVERTED status.
         """
         # If we have video_id and channel_name, try direct key lookup first (most efficient)
         if video_id and channel_name:
             song_key = f"{video_id}@{channel_name}"
             if song_key in self.data["songs"]:
                 song_data = self.data["songs"][song_key]
-                if song_data.get("status") == SongStatus.DOWNLOADED:
+                if song_data.get("status") in [
+                    SongStatus.DOWNLOADED,
+                    SongStatus.CONVERTED,
+                ]:
                     return True
 
         # Fallback to content search (for cases where we don't have video_id)
@@ -249,14 +191,19 @@
             # Check if this song matches the artist and title
             if song_data.get("artist") == artist and song_data.get("title") == title:
                 # Check if it's marked as downloaded
-                if song_data.get("status") == SongStatus.DOWNLOADED:
+                if song_data.get("status") in [
+                    SongStatus.DOWNLOADED,
+                    SongStatus.CONVERTED,
+                ]:
                     return True
             # Also check the video title field which might contain the song info
             video_title = song_data.get("video_title", "")
             if video_title and artist in video_title and title in video_title:
-                if song_data.get("status") == SongStatus.DOWNLOADED:
+                if song_data.get("status") in [
+                    SongStatus.DOWNLOADED,
+                    SongStatus.CONVERTED,
+                ]:
                     return True
 
         return False
 
     def is_file_exists(self, file_path):
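The fast path in `is_song_downloaded` keys the tracking dict as `"<video_id>@<channel_name>"` and treats both final statuses as done. A minimal standalone sketch of that lookup (statuses reduced to plain strings and the sample entries invented for illustration):

```python
DOWNLOADED, CONVERTED = "DOWNLOADED", "CONVERTED"


def is_song_downloaded(songs: dict, video_id: str, channel_name: str) -> bool:
    # Direct key lookup, mirroring the fast path: the tracking key is
    # "<video_id>@<channel_name>", and either final status counts as done.
    song = songs.get(f"{video_id}@{channel_name}")
    return bool(song) and song.get("status") in (DOWNLOADED, CONVERTED)


songs = {
    "dQw4w9WgXcQ@SingKing": {"status": "CONVERTED", "artist": "a-ha", "title": "Take On Me"},
}
print(is_song_downloaded(songs, "dQw4w9WgXcQ", "SingKing"))      # True
print(is_song_downloaded(songs, "dQw4w9WgXcQ", "OtherChannel"))  # False
```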
@@ -336,359 +283,114 @@
         self._save()
 
     def get_channel_video_list(
-        self, channel_url, yt_dlp_path="downloader/yt-dlp.exe", force_refresh=False, show_pagination=False
+        self, channel_url, yt_dlp_path=None, force_refresh=False
     ):
         """
         Return a list of videos (dicts with 'title' and 'id') for the channel, using cache if available unless force_refresh is True.
-
-        Args:
-            channel_url: YouTube channel URL
-            yt_dlp_path: Path to yt-dlp executable
-            force_refresh: Force refresh cache even if available
-            show_pagination: Show page-by-page progress (slower but more detailed)
         """
+        # Use platform-aware path if none provided
+        if yt_dlp_path is None:
+            from karaoke_downloader.config_manager import load_config
+            config = load_config()
+            yt_dlp_path = config.yt_dlp_path
+
         channel_name, channel_id = None, None
 
-        # Check if this is a manual channel
-        from karaoke_downloader.manual_video_manager import is_manual_channel, get_manual_channel_info, get_manual_videos_for_channel
-
-        if is_manual_channel(channel_url):
-            channel_name, channel_id = get_manual_channel_info(channel_url)
-            if channel_name and channel_id:
-                print(f"  📋 Loading manual videos for {channel_name}")
-                manual_videos = get_manual_videos_for_channel(channel_name)
-                # Convert to the expected format
-                videos = []
-                for video in manual_videos:
-                    videos.append({
-                        "title": video.get("title", ""),
-                        "id": video.get("id", ""),
-                        "url": video.get("url", "")
-                    })
-                print(f"  ✅ Loaded {len(videos)} manual videos")
-                return videos
-            else:
-                print(f"  ❌ Could not get manual channel info for: {channel_url}")
-                return []
-
-        # Regular YouTube channel processing
         from karaoke_downloader.youtube_utils import get_channel_info
 
         channel_name, channel_id = get_channel_info(channel_url)
 
-        if not channel_id:
-            print(f"  ❌ Could not extract channel ID from URL: {channel_url}")
-            return []
-
-        print(f"  🔍 Channel: {channel_name} (ID: {channel_id})")
-
-        # Check if we have cached data for this channel
-        if not force_refresh:
-            cached_videos = self._load_channel_cache(channel_id)
-            if cached_videos:
-                # Validate that the cached data has proper video IDs
-                corrupted = False
-
-                # Check if any video IDs look like titles instead of proper YouTube IDs
-                for video in cached_videos[:20]:  # Check first 20 videos
-                    video_id = video.get("id", "")
-                    # More comprehensive validation - YouTube IDs should be 11 characters and contain only alphanumeric, hyphens, and underscores
-                    if video_id and (
-                        len(video_id) != 11 or
-                        not video_id.replace('-', '').replace('_', '').isalnum() or
-                        " " in video_id or
-                        "Lyrics" in video_id or
-                        "KARAOKE" in video_id.upper() or
-                        "Vocal" in video_id or
-                        "Guide" in video_id
-                    ):
-                        print(f"  ⚠️ Detected corrupted video ID in cache: '{video_id}'")
-                        corrupted = True
-                        break
-
-                if corrupted:
-                    print(f"  🧹 Clearing corrupted cache for {channel_id}")
-                    self._clear_channel_cache(channel_id)
-                    force_refresh = True
-                else:
-                    print(f"  📋 Using cached video list ({len(cached_videos)} videos)")
-                    return cached_videos
-
-        # Choose fetch method based on show_pagination flag
-        if show_pagination:
-            return self._fetch_videos_with_pagination(channel_url, channel_id, yt_dlp_path)
-        else:
-            return self._fetch_videos_flat_playlist(channel_url, channel_id, yt_dlp_path)
-
-    def _fetch_videos_with_pagination(self, channel_url, channel_id, yt_dlp_path):
-        """Fetch videos showing page-by-page progress."""
-        print(f"  🌐 Fetching video list from YouTube (page-by-page mode)...")
-        print(f"  📡 Channel URL: {channel_url}")
-
-        import subprocess
-
-        all_videos = []
-        page = 1
-        videos_per_page = 200  # YouTube/yt-dlp supports up to 200 videos per page, reducing API calls and errors
-
-        while True:
-            print(f"  📄 Fetching page {page}...")
-
-            # Fetch one page at a time
-            cmd = [
-                yt_dlp_path,
-                "--flat-playlist",
-                "--print",
-                "%(title)s|%(id)s|%(url)s",
-                "--playlist-start",
-                str((page - 1) * videos_per_page + 1),
-                "--playlist-end",
-                str(page * videos_per_page),
-                channel_url,
-            ]
-
-            try:
-                # Increased timeout to 180 seconds for larger pages (200 videos)
-                result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=180)
-                lines = result.stdout.strip().splitlines()
-
-                # Save raw output for debugging (for each page)
-                raw_output_file = self._get_channel_cache_file(channel_id).parent / f"{channel_id}_raw_output_page{page}.txt"
-                try:
-                    with open(raw_output_file, 'w', encoding='utf-8') as f:
-                        f.write(f"# Raw yt-dlp output for {channel_id} - Page {page}\n")
-                        f.write(f"# Channel URL: {channel_url}\n")
-                        f.write(f"# Command: {' '.join(cmd)}\n")
-                        f.write(f"# Timestamp: {datetime.now().isoformat()}\n")
-                        f.write(f"# Total lines: {len(lines)}\n")
-                        f.write("#" * 80 + "\n\n")
-                        for i, line in enumerate(lines, 1):
-                            f.write(f"{i:6d}: {line}\n")
-                    print(f"  💾 Saved raw output to: {raw_output_file.name}")
-                except Exception as e:
-                    print(f"  ⚠️ Could not save raw output: {e}")
-
-                if not lines:
-                    print(f"  ✅ No more videos found on page {page}")
-                    break
-
-                print(f"  📊 Page {page}: Found {len(lines)} videos")
-
-                page_videos = []
-                invalid_count = 0
-
-                for line in lines:
-                    if not line.strip():
-                        continue
-
-                    # More robust parsing that handles titles with | characters
-                    # Extract video ID directly from the URL that yt-dlp provides
-
-                    # Find the URL and extract video ID from it
-                    url_match = re.search(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
-                    if not url_match:
-                        continue
-
-                    # Extract video ID directly from the URL
-                    video_id = url_match.group(1)
-
-                    # Extract title (everything before the video ID in the line)
-                    title = line[:line.find(video_id)].rstrip('|').strip()
-
-                    # Validate video ID
-                    if video_id and (
-                        len(video_id) == 11 and
-                        video_id.replace('-', '').replace('_', '').isalnum() and
-                        " " not in video_id and
-                        "Lyrics" not in video_id and
-                        "KARAOKE" not in video_id.upper() and
-                        "Vocal" not in video_id and
-                        "Guide" not in video_id
-                    ):
-                        page_videos.append({"title": title, "id": video_id})
-                    else:
-                        invalid_count += 1
-                        if invalid_count <= 3:  # Show first 3 invalid IDs per page
-                            print(f"  ⚠️ Invalid ID: '{video_id}' for '{title[:50]}...'")
-
-                if invalid_count > 3:
-                    print(f"  ⚠️ ... and {invalid_count - 3} more invalid IDs on this page")
-
-                all_videos.extend(page_videos)
-                print(f"  ✅ Page {page}: Added {len(page_videos)} valid videos (total: {len(all_videos)})")
-
-                # If we got fewer videos than expected, we're probably at the end
-                if len(lines) < videos_per_page:
-                    print(f"  🏁 Reached end of channel (last page had {len(lines)} videos)")
-                    break
-
-                page += 1
-
-                # Safety check to prevent infinite loops
-                if page > 50:  # Max 50 pages (10,000 videos with 200 per page)
-                    print(f"  ⚠️ Reached maximum page limit (50 pages), stopping")
-                    break
-
-            except subprocess.TimeoutExpired:
-                print(f"  ⚠️ Page {page} timed out, stopping")
-                break
-            except subprocess.CalledProcessError as e:
-                print(f"  ❌ Error fetching page {page}: {e}")
-                break
-            except KeyboardInterrupt:
-                print(f"  ⏹️ User interrupted, stopping at page {page}")
-                break
-
-        if not all_videos:
-            print(f"  ❌ No valid videos found")
-            return []
-
-        print(f"  🎉 Channel download complete!")
-        print(f"  📊 Total videos fetched: {len(all_videos)}")
-
-        # Save to individual channel cache file
-        self._save_channel_cache(channel_id, all_videos)
-        print(f"  💾 Saved cache to: {self._get_channel_cache_file(channel_id).name}")
-
-        return all_videos
-
-    def _fetch_videos_flat_playlist(self, channel_url, channel_id, yt_dlp_path):
-        """Fetch all videos using flat playlist (faster but less detailed progress)."""
+        # Check if cache has the old flat structure or new nested structure
+        cache_data = None
+        cache_key = None
+
+        # Try nested structure first (new format)
+        if "channels" in self.cache:
+            # Try multiple possible cache keys in nested structure
+            possible_keys = [
+                channel_id,  # The extracted channel ID
+                channel_url,  # The full URL
+                channel_name,  # The extracted channel name
+            ]
+            for key in possible_keys:
+                if key and key in self.cache["channels"]:
+                    cache_data = self.cache["channels"][key]["videos"]
+                    cache_key = key
+                    break
+
+        # Try flat structure (old format) as fallback
+        if cache_data is None:
+            possible_keys = [
+                channel_id,  # The extracted channel ID
+                channel_url,  # The full URL
+                channel_name,  # The extracted channel name
+            ]
+            for key in possible_keys:
+                if key and key in self.cache:
+                    cache_data = self.cache[key]
+                    cache_key = key
+                    break
+
+        if not cache_key:
+            cache_key = channel_id or channel_url  # Use as fallback for new entries
+
+        print(f"  🔍 Trying cache keys: {possible_keys}")
+        print(f"  🔍 Selected cache key: '{cache_key}'")
+
+        if not force_refresh and cache_data is not None:
+            print(
+                f"  📋 Using cached video list ({len(cache_data)} videos)"
+            )
+            # Convert old cache format to new format if needed
+            converted_videos = []
+            for video in cache_data:
+                if "video_id" in video and "id" not in video:
+                    # Convert old format to new format
+                    converted_videos.append({
+                        "title": video["title"],
+                        "id": video["video_id"]
+                    })
+                else:
+                    # Already in new format
+                    converted_videos.append(video)
+            return converted_videos
+        else:
+            print(f"  ❌ Cache miss for all keys")
+
         # Fetch with yt-dlp
         print(f"  🌐 Fetching video list from YouTube (this may take a while)...")
-        print(f"  📡 Channel URL: {channel_url}")
 
         import subprocess
         from karaoke_downloader.youtube_utils import _parse_yt_dlp_command
 
-        # First, let's get the total count to show progress
-        count_cmd = _parse_yt_dlp_command(yt_dlp_path) + [
-            "--flat-playlist",
-            "--print",
-            "%(title)s",
-            "--playlist-end",
-            "1",  # Just get first video to test
-            channel_url,
-        ]
-
-        try:
-            print(f"  🔍 Testing channel access...")
-            test_result = subprocess.run(count_cmd, capture_output=True, text=True, timeout=30)
-            if test_result.returncode == 0:
-                print(f"  ✅ Channel is accessible")
-            else:
-                print(f"  ⚠️ Channel test failed: {test_result.stderr}")
-        except subprocess.TimeoutExpired:
-            print(f"  ⚠️ Channel test timed out")
-        except Exception as e:
-            print(f"  ⚠️ Channel test error: {e}")
-
-        # Now fetch all videos with progress indicators
         cmd = _parse_yt_dlp_command(yt_dlp_path) + [
             "--flat-playlist",
             "--print",
             "%(title)s|%(id)s|%(url)s",
-            "--verbose",  # Add verbose output to see what's happening
             channel_url,
         ]
 
         try:
-            print(f"  🔧 Running yt-dlp command: {' '.join(cmd)}")
-            print(f"  📥 Starting video list download...")
-
-            # Use a timeout and show progress
-            result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=300)
+            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
             lines = result.stdout.strip().splitlines()
 
-            # Save raw output for debugging
-            raw_output_file = self._get_channel_cache_file(channel_id).parent / f"{channel_id}_raw_output.txt"
-            try:
-                with open(raw_output_file, 'w', encoding='utf-8') as f:
-                    f.write(f"# Raw yt-dlp output for {channel_id}\n")
-                    f.write(f"# Channel URL: {channel_url}\n")
-                    f.write(f"# Command: {' '.join(cmd)}\n")
|
|
||||||
f.write(f"# Timestamp: {datetime.now().isoformat()}\n")
|
|
||||||
f.write(f"# Total lines: {len(lines)}\n")
|
|
||||||
f.write("#" * 80 + "\n\n")
|
|
||||||
for i, line in enumerate(lines, 1):
|
|
||||||
f.write(f"{i:6d}: {line}\n")
|
|
||||||
print(f" 💾 Saved raw output to: {raw_output_file.name}")
|
|
||||||
except Exception as e:
|
|
||||||
print(f" ⚠️ Could not save raw output: {e}")
|
|
||||||
|
|
||||||
print(f" 📄 Raw output lines: {len(lines)}")
|
|
||||||
print(f" 📊 Download completed successfully!")
|
|
||||||
|
|
||||||
# Show some sample lines to understand the format
|
|
||||||
if lines:
|
|
||||||
print(f" 📋 Sample output format:")
|
|
||||||
for i, line in enumerate(lines[:3]):
|
|
||||||
print(f" Line {i+1}: {line[:100]}...")
|
|
||||||
if len(lines) > 3:
|
|
||||||
print(f" ... and {len(lines) - 3} more lines")
|
|
||||||
|
|
||||||
videos = []
|
videos = []
|
||||||
invalid_count = 0
|
for line in lines:
|
||||||
|
parts = line.split("|")
|
||||||
print(f" 🔍 Processing {len(lines)} video entries...")
|
if len(parts) >= 2:
|
||||||
|
title, video_id = parts[0].strip(), parts[1].strip()
|
||||||
for i, line in enumerate(lines):
|
|
||||||
if i % 1000 == 0 and i > 0: # Progress indicator every 1000 lines
|
|
||||||
print(f" 📊 Processing line {i}/{len(lines)}... ({i/len(lines)*100:.1f}%)")
|
|
||||||
|
|
||||||
# More robust parsing that handles titles with | characters
|
|
||||||
# Extract video ID directly from the URL that yt-dlp provides
|
|
||||||
|
|
||||||
# Find the URL and extract video ID from it
|
|
||||||
url_match = re.search(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
|
|
||||||
if not url_match:
|
|
||||||
invalid_count += 1
|
|
||||||
if invalid_count <= 5:
|
|
||||||
print(f" ⚠️ Skipping line with no URL: '{line[:100]}...'")
|
|
||||||
elif invalid_count == 6:
|
|
||||||
print(f" ⚠️ ... and {len(lines) - i - 1} more invalid lines")
|
|
||||||
continue
|
|
||||||
|
|
||||||
# Extract video ID directly from the URL
|
|
||||||
video_id = url_match.group(1)
|
|
||||||
|
|
||||||
# Extract title (everything before the video ID in the line)
|
|
||||||
title = line[:line.find(video_id)].rstrip('|').strip()
|
|
||||||
|
|
||||||
# Validate video ID
|
|
||||||
if video_id and (
|
|
||||||
len(video_id) == 11 and
|
|
||||||
video_id.replace('-', '').replace('_', '').isalnum() and
|
|
||||||
" " not in video_id and
|
|
||||||
"Lyrics" not in video_id and
|
|
||||||
"KARAOKE" not in video_id.upper() and
|
|
||||||
"Vocal" not in video_id and
|
|
||||||
"Guide" not in video_id
|
|
||||||
):
|
|
||||||
videos.append({"title": title, "id": video_id})
|
videos.append({"title": title, "id": video_id})
|
||||||
else:
|
|
||||||
invalid_count += 1
|
|
||||||
if invalid_count <= 5: # Only show first 5 invalid IDs
|
|
||||||
print(f" ⚠️ Skipping invalid video ID: '{video_id}' for title: '{title[:50]}...'")
|
|
||||||
elif invalid_count == 6:
|
|
||||||
print(f" ⚠️ ... and {len(lines) - i - 1} more invalid IDs")
|
|
||||||
|
|
||||||
if not videos:
|
# Save in nested structure format
|
||||||
print(f" ❌ No valid videos found after parsing")
|
if "channels" not in self.cache:
|
||||||
return []
|
self.cache["channels"] = {}
|
||||||
|
|
||||||
print(f" ✅ Parsed {len(videos)} valid videos from YouTube")
|
self.cache["channels"][cache_key] = {
|
||||||
print(f" ⚠️ Skipped {invalid_count} invalid video IDs")
|
"videos": videos,
|
||||||
|
"last_updated": datetime.now().isoformat(),
|
||||||
# Save to individual channel cache file
|
"channel_name": channel_name,
|
||||||
self._save_channel_cache(channel_id, videos)
|
"channel_id": channel_id
|
||||||
print(f" 💾 Saved cache to: {self._get_channel_cache_file(channel_id).name}")
|
}
|
||||||
|
|
||||||
|
self.save_cache()
|
||||||
return videos
|
return videos
|
||||||
|
|
||||||
except subprocess.TimeoutExpired:
|
|
||||||
print(f"❌ yt-dlp timed out after 5 minutes - channel may be too large")
|
|
||||||
return []
|
|
||||||
except subprocess.CalledProcessError as e:
|
except subprocess.CalledProcessError as e:
|
||||||
print(f"❌ yt-dlp failed to fetch playlist for cache: {e}")
|
print(f"❌ yt-dlp failed to fetch playlist for cache: {e}")
|
||||||
print(f" 📄 stderr: {e.stderr}")
|
|
||||||
return []
|
return []
|
||||||
|
|||||||
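The hunk above replaces the URL-regex parsing with a plain `line.split("|")` over yt-dlp's `%(title)s|%(id)s|%(url)s` output. A minimal sketch (not part of this diff; `parse_flat_playlist_line` is a hypothetical helper) of how such a line can be split from the right, so that titles containing `|` survive, since the last two fields are always the 11-character ID and the watch URL:

```python
import re

def parse_flat_playlist_line(line):
    """Parse one '%(title)s|%(id)s|%(url)s' line printed by yt-dlp.

    Titles may themselves contain '|', so split from the right: the last
    two fields are always the video ID and the watch URL.
    """
    parts = line.rsplit("|", 2)
    if len(parts) != 3:
        return None
    title, video_id, url = (p.strip() for p in parts)
    # YouTube video IDs are 11 characters from [A-Za-z0-9_-]
    if not re.fullmatch(r"[A-Za-z0-9_-]{11}", video_id):
        return None
    return {"title": title, "id": video_id, "url": url}

line = "Artist - Song | Karaoke|dQw4w9WgXcQ|https://www.youtube.com/watch?v=dQw4w9WgXcQ"
print(parse_flat_playlist_line(line))
```

A left-to-right `split("|")` would truncate the title in the sample line above at the first pipe; the right split keeps it whole.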
@@ -107,10 +107,6 @@ def download_single_video(
     video_url = f"https://www.youtube.com/watch?v={video_id}"

-    # Debug: Show the video_id and URL being used
-    print(f"🔍 DEBUG: video_id = '{video_id}'")
-    print(f"🔍 DEBUG: video_url = '{video_url}'")
-
     # Build command using centralized utility
     cmd = build_yt_dlp_command(yt_dlp_path, video_url, output_path, config)

@@ -259,7 +255,7 @@ def execute_download_plan(
         video_id = item["video_id"]
         video_title = item["video_title"]

-        print(f"\n⬇️ Downloading {downloaded_count + 1} of {total_to_download}:")
+        print(f"\n⬇️ Downloading {len(download_plan) - idx} of {total_to_download}:")
         print(f"   📋 Songlist: {artist} - {title}")
         print(f"   🎬 Video: {video_title} ({channel_name})")
         if "match_score" in item:
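The counter change above swaps an ascending "completed + 1" label for one derived from the loop index. Assuming `idx` comes from `enumerate()` over the plan, the new expression counts down the items still left rather than up; a toy sketch (names hypothetical) of the difference:

```python
def remaining_label(download_plan, idx):
    # idx comes from enumerate(); len(plan) - idx is the number of
    # items not yet finished, so the label counts down
    return f"{len(download_plan) - idx} of {len(download_plan)}"

plan = ["song_a", "song_b", "song_c"]
for idx, item in enumerate(plan):
    print(f"⬇️ Downloading {remaining_label(plan, idx)}: {item}")
```

Unlike `downloaded_count + 1`, the index-based form stays correct even when an item is skipped without incrementing a success counter.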
@@ -23,9 +23,15 @@ def _parse_yt_dlp_command(yt_dlp_path: str) -> List[str]:


 def get_channel_info(
-    channel_url: str, yt_dlp_path: str = "downloader/yt-dlp.exe"
+    channel_url: str, yt_dlp_path: str = None
 ) -> tuple[str, str]:
     """Get channel information using yt-dlp. Returns (channel_name, channel_id)."""
+    # Use platform-aware path if none provided
+    if yt_dlp_path is None:
+        from karaoke_downloader.config_manager import load_config
+        config = load_config()
+        yt_dlp_path = config.yt_dlp_path
+
     try:
         # Extract channel name from URL for now (faster than calling yt-dlp)
         if "/@" in channel_url:

@@ -52,9 +58,15 @@ def get_channel_info(


 def get_playlist_info(
-    playlist_url: str, yt_dlp_path: str = "downloader/yt-dlp.exe"
+    playlist_url: str, yt_dlp_path: str = None
 ) -> List[Dict[str, Any]]:
     """Get playlist information using yt-dlp."""
+    # Use platform-aware path if none provided
+    if yt_dlp_path is None:
+        from karaoke_downloader.config_manager import load_config
+        config = load_config()
+        yt_dlp_path = config.yt_dlp_path
+
     try:
         cmd = _parse_yt_dlp_command(yt_dlp_path) + ["--dump-json", "--flat-playlist", playlist_url]
         result = subprocess.run(cmd, capture_output=True, text=True, check=True)

@@ -88,7 +100,7 @@ def build_yt_dlp_command(
     Returns:
         List of command arguments for subprocess.run
     """
-    cmd = _parse_yt_dlp_command(yt_dlp_path) + [
+    cmd = _parse_yt_dlp_command(str(yt_dlp_path)) + [
         "--no-check-certificates",
         "--ignore-errors",
         "--no-warnings",

@@ -129,7 +141,7 @@ def execute_yt_dlp_command(


 def show_available_formats(
-    video_url: str, yt_dlp_path: str = "downloader/yt-dlp.exe", timeout: int = 30
+    video_url: str, yt_dlp_path: str = None, timeout: int = 30
 ) -> None:
     """
     Show available formats for a video (debugging utility).

@@ -139,8 +151,14 @@ def show_available_formats(
         yt_dlp_path: Path to yt-dlp executable
         timeout: Timeout in seconds
     """
+    # Use platform-aware path if none provided
+    if yt_dlp_path is None:
+        from karaoke_downloader.config_manager import load_config
+        config = load_config()
+        yt_dlp_path = config.yt_dlp_path
+
     print(f"🔍 Checking available formats for: {video_url}")
-    format_cmd = _parse_yt_dlp_command(yt_dlp_path) + ["--list-formats", video_url]
+    format_cmd = _parse_yt_dlp_command(str(yt_dlp_path)) + ["--list-formats", video_url]
     try:
         format_result = subprocess.run(
             format_cmd, capture_output=True, text=True, timeout=timeout
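The hunks above change every hard-coded `"downloader/yt-dlp.exe"` default into `None` resolved at call time from config. A self-contained sketch of that idiom (the platform-to-binary mapping here is an illustrative stand-in for `config_manager.load_config().yt_dlp_path`, and this `get_channel_info` is a simplified stand-in, not the real function):

```python
import platform

def default_yt_dlp_path():
    # Stand-in for config_manager.load_config().yt_dlp_path: map the
    # current OS to the bundled binary name (assumption for illustration)
    return {
        "Windows": "downloader/yt-dlp.exe",
        "Darwin": "downloader/yt-dlp_macos",
    }.get(platform.system(), "downloader/yt-dlp")

def get_channel_info(channel_url, yt_dlp_path=None):
    if yt_dlp_path is None:  # resolved per call, per platform
        yt_dlp_path = default_yt_dlp_path()
    return channel_url, yt_dlp_path
```

Resolving the default at call time keeps Windows-specific paths out of function signatures, so the same module works unchanged on macOS and Linux.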
setup_macos.py (deleted, 220 lines)
@@ -1,220 +0,0 @@
-#!/usr/bin/env python3
-"""
-macOS setup script for Karaoke Video Downloader.
-This script helps users set up yt-dlp and FFmpeg on macOS.
-"""
-
-import os
-import sys
-import subprocess
-from pathlib import Path
-
-
-def check_ffmpeg():
-    """Check if FFmpeg is installed."""
-    try:
-        result = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True, timeout=10)
-        return result.returncode == 0
-    except (subprocess.TimeoutExpired, FileNotFoundError):
-        return False
-
-
-def check_yt_dlp():
-    """Check if yt-dlp is installed via pip or binary."""
-    # Check pip installation
-    try:
-        result = subprocess.run([sys.executable, "-m", "yt_dlp", "--version"],
-                                capture_output=True, text=True, timeout=10)
-        if result.returncode == 0:
-            return True
-    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
-        pass
-
-    # Check binary file
-    binary_path = Path("downloader/yt-dlp_macos")
-    if binary_path.exists():
-        try:
-            result = subprocess.run([str(binary_path), "--version"],
-                                    capture_output=True, text=True, timeout=10)
-            return result.returncode == 0
-        except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
-            pass
-
-    return False
-
-
-def install_ffmpeg():
-    """Install FFmpeg via Homebrew."""
-    print("🎬 Installing FFmpeg...")
-
-    # Check if Homebrew is installed
-    try:
-        subprocess.run(["brew", "--version"], capture_output=True, check=True)
-    except (subprocess.CalledProcessError, FileNotFoundError):
-        print("❌ Homebrew is not installed. Please install Homebrew first:")
-        print("   /bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"")
-        return False
-
-    try:
-        print("🍺 Installing FFmpeg via Homebrew...")
-        result = subprocess.run(["brew", "install", "ffmpeg"],
-                                capture_output=True, text=True, check=True)
-        print("✅ FFmpeg installed successfully!")
-        return True
-    except subprocess.CalledProcessError as e:
-        print(f"❌ Failed to install FFmpeg: {e}")
-        return False
-
-
-def download_yt_dlp_binary():
-    """Download yt-dlp binary for macOS."""
-    print("📥 Downloading yt-dlp binary for macOS...")
-
-    # Create downloader directory if it doesn't exist
-    downloader_dir = Path("downloader")
-    downloader_dir.mkdir(exist_ok=True)
-
-    # Download yt-dlp binary
-    binary_path = downloader_dir / "yt-dlp_macos"
-    url = "https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos"
-
-    try:
-        print(f"📡 Downloading from: {url}")
-        result = subprocess.run(["curl", "-L", "-o", str(binary_path), url],
-                                capture_output=True, text=True, check=True)
-
-        # Make it executable
-        binary_path.chmod(0o755)
-        print(f"✅ yt-dlp binary downloaded to: {binary_path}")
-
-        # Test the binary
-        test_result = subprocess.run([str(binary_path), "--version"],
-                                     capture_output=True, text=True, timeout=10)
-        if test_result.returncode == 0:
-            version = test_result.stdout.strip()
-            print(f"✅ Binary test successful! Version: {version}")
-            return True
-        else:
-            print(f"❌ Binary test failed: {test_result.stderr}")
-            return False
-
-    except subprocess.CalledProcessError as e:
-        print(f"❌ Failed to download yt-dlp binary: {e}")
-        return False
-    except Exception as e:
-        print(f"❌ Error downloading binary: {e}")
-        return False
-
-
-def install_yt_dlp():
-    """Install yt-dlp via pip."""
-    print("📦 Installing yt-dlp...")
-
-    try:
-        result = subprocess.run([sys.executable, "-m", "pip", "install", "yt-dlp"],
-                                capture_output=True, text=True, check=True)
-        print("✅ yt-dlp installed successfully!")
-        return True
-    except subprocess.CalledProcessError as e:
-        print(f"❌ Failed to install yt-dlp: {e}")
-        return False
-
-
-def test_installation():
-    """Test the installation."""
-    print("\n🧪 Testing installation...")
-
-    # Test FFmpeg
-    if check_ffmpeg():
-        print("✅ FFmpeg is working!")
-    else:
-        print("❌ FFmpeg is not working")
-        return False
-
-    # Test yt-dlp
-    if check_yt_dlp():
-        print("✅ yt-dlp is working!")
-    else:
-        print("❌ yt-dlp is not working")
-        return False
-
-    return True
-
-
-def main():
-    print("🍎 macOS Setup for Karaoke Video Downloader")
-    print("=" * 50)
-
-    # Check current status
-    print("🔍 Checking current installation...")
-    ffmpeg_installed = check_ffmpeg()
-    yt_dlp_installed = check_yt_dlp()
-
-    print(f"FFmpeg: {'✅ Installed' if ffmpeg_installed else '❌ Not installed'}")
-    print(f"yt-dlp: {'✅ Installed' if yt_dlp_installed else '❌ Not installed'}")
-
-    if ffmpeg_installed and yt_dlp_installed:
-        print("\n🎉 Everything is already installed and working!")
-        return
-
-    # Install missing components
-    print("\n🚀 Installing missing components...")
-
-    # Install FFmpeg if needed
-    if not ffmpeg_installed:
-        print("\n🎬 FFmpeg Installation Options:")
-        print("1. Install via Homebrew (recommended)")
-        print("2. Download from ffmpeg.org")
-        print("3. Skip FFmpeg installation")
-
-        choice = input("\nChoose an option (1-3): ").strip()
-
-        if choice == "1":
-            if not install_ffmpeg():
-                print("❌ FFmpeg installation failed")
-                return
-        elif choice == "2":
-            print("📥 Please download FFmpeg from: https://ffmpeg.org/download.html")
-            print("   Extract and add to your PATH, then run this script again.")
-            return
-        elif choice == "3":
-            print("⚠️ FFmpeg is required for video processing. Some features may not work.")
-        else:
-            print("❌ Invalid choice")
-            return
-
-    # Install yt-dlp if needed
-    if not yt_dlp_installed:
-        print("\n📦 yt-dlp Installation Options:")
-        print("1. Install via pip (recommended)")
-        print("2. Download binary file")
-        print("3. Skip yt-dlp installation")
-
-        choice = input("\nChoose an option (1-3): ").strip()
-
-        if choice == "1":
-            if not install_yt_dlp():
-                print("❌ yt-dlp installation failed")
-                return
-        elif choice == "2":
-            if not download_yt_dlp_binary():
-                print("❌ yt-dlp binary download failed")
-                return
-        elif choice == "3":
-            print("❌ yt-dlp is required for video downloading.")
-            return
-        else:
-            print("❌ Invalid choice")
-            return
-
-    # Test installation
-    if test_installation():
-        print("\n🎉 Setup completed successfully!")
-        print("You can now use the Karaoke Video Downloader on macOS.")
-        print("Run: python download_karaoke.py --help")
-    else:
-        print("\n❌ Setup failed. Please check the error messages above.")
-
-
-if __name__ == "__main__":
-    main()
setup_platform.py (new file, 288 lines)
@@ -0,0 +1,288 @@
+#!/usr/bin/env python3
+"""
+Platform setup script for Karaoke Video Downloader.
+This script helps users download the correct yt-dlp binary for their platform.
+"""
+
+import os
+import platform
+import sys
+import urllib.request
+import zipfile
+import tarfile
+from pathlib import Path
+
+
+def detect_platform():
+    """Detect the current platform and return platform info."""
+    system = platform.system().lower()
+    machine = platform.machine().lower()
+
+    if system == "windows":
+        return "windows", "yt-dlp.exe"
+    elif system == "darwin":
+        return "macos", "yt-dlp_macos"
+    elif system == "linux":
+        return "linux", "yt-dlp"
+    else:
+        return "unknown", "yt-dlp"
+
+
+def get_download_url(platform_name):
+    """Get the download URL for yt-dlp based on platform."""
+    base_url = "https://github.com/yt-dlp/yt-dlp/releases/latest/download"
+
+    if platform_name == "windows":
+        return f"{base_url}/yt-dlp.exe"
+    elif platform_name == "macos":
+        return f"{base_url}/yt-dlp_macos"
+    elif platform_name == "linux":
+        return f"{base_url}/yt-dlp"
+    else:
+        raise ValueError(f"Unsupported platform: {platform_name}")
+
+
+def install_via_pip():
+    """Install yt-dlp via pip."""
+    print("📦 Installing yt-dlp via pip...")
+    try:
+        import subprocess
+        result = subprocess.run([sys.executable, "-m", "pip", "install", "yt-dlp"],
+                                capture_output=True, text=True, check=True)
+        print("✅ yt-dlp installed successfully via pip!")
+        return True
+    except subprocess.CalledProcessError as e:
+        print(f"❌ Failed to install yt-dlp via pip: {e}")
+        return False
+
+
+def check_ffmpeg():
+    """Check if FFmpeg is installed and available."""
+    try:
+        import subprocess
+        result = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True, timeout=10)
+        return result.returncode == 0
+    except (subprocess.TimeoutExpired, FileNotFoundError):
+        return False
+
+
+def install_ffmpeg():
+    """Install FFmpeg based on platform."""
+    import subprocess
+    platform_name, _ = detect_platform()
+
+    print("🎬 Installing FFmpeg...")
+
+    if platform_name == "macos":
+        # Try using Homebrew first
+        try:
+            print("🍺 Attempting to install FFmpeg via Homebrew...")
+            result = subprocess.run(["brew", "install", "ffmpeg"],
+                                    capture_output=True, text=True, check=True)
+            print("✅ FFmpeg installed successfully via Homebrew!")
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            print("⚠️ Homebrew not found or failed. Trying alternative methods...")
+
+        # Try using MacPorts
+        try:
+            print("🍎 Attempting to install FFmpeg via MacPorts...")
+            result = subprocess.run(["sudo", "port", "install", "ffmpeg"],
+                                    capture_output=True, text=True, check=True)
+            print("✅ FFmpeg installed successfully via MacPorts!")
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            print("❌ Could not install FFmpeg automatically.")
+            print("Please install FFmpeg manually:")
+            print("1. Install Homebrew: https://brew.sh/")
+            print("2. Run: brew install ffmpeg")
+            print("3. Or download from: https://ffmpeg.org/download.html")
+            return False
+
+    elif platform_name == "linux":
+        try:
+            print("🐧 Attempting to install FFmpeg via package manager...")
+            # Try apt (Ubuntu/Debian)
+            try:
+                result = subprocess.run(["sudo", "apt", "update"], capture_output=True, text=True, check=True)
+                result = subprocess.run(["sudo", "apt", "install", "-y", "ffmpeg"],
+                                        capture_output=True, text=True, check=True)
+                print("✅ FFmpeg installed successfully via apt!")
+                return True
+            except subprocess.CalledProcessError:
+                # Try yum (CentOS/RHEL)
+                try:
+                    result = subprocess.run(["sudo", "yum", "install", "-y", "ffmpeg"],
+                                            capture_output=True, text=True, check=True)
+                    print("✅ FFmpeg installed successfully via yum!")
+                    return True
+                except subprocess.CalledProcessError:
+                    print("❌ Could not install FFmpeg automatically.")
+                    print("Please install FFmpeg manually for your Linux distribution.")
+                    return False
+        except FileNotFoundError:
+            print("❌ Could not install FFmpeg automatically.")
+            print("Please install FFmpeg manually for your Linux distribution.")
+            return False
+
+    elif platform_name == "windows":
+        print("❌ FFmpeg installation not automated for Windows.")
+        print("Please install FFmpeg manually:")
+        print("1. Download from: https://ffmpeg.org/download.html")
+        print("2. Extract to a folder and add to PATH")
+        print("3. Or use Chocolatey: choco install ffmpeg")
+        return False
+
+    return False
+
+
+def download_file(url, destination):
+    """Download a file from URL to destination."""
+    print(f"📥 Downloading from: {url}")
+    print(f"📁 Saving to: {destination}")
+
+    try:
+        urllib.request.urlretrieve(url, destination)
+        print("✅ Download completed successfully!")
+        return True
+    except Exception as e:
+        print(f"❌ Download failed: {e}")
+        return False
+
+
+def make_executable(file_path):
+    """Make a file executable (for Unix-like systems)."""
+    try:
+        os.chmod(file_path, 0o755)
+        print(f"🔧 Made {file_path} executable")
+    except Exception as e:
+        print(f"⚠️ Could not make file executable: {e}")
+
+
+def main():
+    print("🎤 Karaoke Video Downloader - Platform Setup")
+    print("=" * 50)
+
+    # Detect platform
+    platform_name, binary_name = detect_platform()
+    print(f"🖥️ Detected platform: {platform_name}")
+    print(f"📦 Binary name: {binary_name}")
+
+    # Create downloader directory if it doesn't exist
+    downloader_dir = Path("downloader")
+    downloader_dir.mkdir(exist_ok=True)
+
+    # Check if binary already exists
+    binary_path = downloader_dir / binary_name
+    if binary_path.exists():
+        print(f"✅ {binary_name} already exists in downloader/ directory")
+        response = input("Do you want to re-download it? (y/N): ").strip().lower()
+        if response != 'y':
+            print("Setup completed!")
+            return
+
+    # Offer installation options
+    print(f"\n🔧 Installation options for {platform_name}:")
+    print("1. Download binary file (recommended for most users)")
+    print("2. Install via pip (alternative method)")
+
+    choice = input("Choose installation method (1 or 2): ").strip()
+
+    if choice == "2":
+        # Install via pip
+        if install_via_pip():
+            print(f"\n✅ yt-dlp installed successfully!")
+
+            # Test the installation
+            print(f"\n🧪 Testing yt-dlp installation...")
+            try:
+                import subprocess
+                result = subprocess.run([sys.executable, "-m", "yt_dlp", "--version"],
+                                        capture_output=True, text=True, timeout=10)
+                if result.returncode == 0:
+                    version = result.stdout.strip()
+                    print(f"✅ yt-dlp is working! Version: {version}")
+                else:
+                    print(f"⚠️ yt-dlp test failed: {result.stderr}")
+            except Exception as e:
+                print(f"⚠️ Could not test yt-dlp: {e}")
+
+            # Check and install FFmpeg
+            print(f"\n🎬 Checking FFmpeg installation...")
+            if check_ffmpeg():
+                print(f"✅ FFmpeg is already installed and working!")
+            else:
+                print(f"⚠️ FFmpeg not found. Installing...")
+                if install_ffmpeg():
+                    print(f"✅ FFmpeg installed successfully!")
+                else:
+                    print(f"⚠️ FFmpeg installation failed. The tool will still work but may be slower.")
+
+            print(f"\n🎉 Setup completed successfully!")
+            print(f"📦 yt-dlp installed via pip")
+            print(f"🖥️ Platform: {platform_name}")
+            print(f"\n🎉 You're ready to use the Karaoke Video Downloader!")
+            print(f"Run: python download_karaoke.py --help")
+            return
+        else:
+            print("❌ Pip installation failed. Trying binary download...")
+
+    # Download binary file
+    try:
+        download_url = get_download_url(platform_name)
+    except ValueError as e:
+        print(f"❌ {e}")
+        print("Please manually download yt-dlp for your platform from:")
+        print("https://github.com/yt-dlp/yt-dlp/releases/latest")
+        return
+
+    # Download the binary
+    print(f"\n🚀 Downloading yt-dlp for {platform_name}...")
+    if download_file(download_url, binary_path):
+        # Make executable on Unix-like systems
+        if platform_name in ["macos", "linux"]:
+            make_executable(binary_path)
+
+        print(f"\n✅ yt-dlp binary downloaded successfully!")
+        print(f"📁 yt-dlp binary location: {binary_path}")
+        print(f"🖥️ Platform: {platform_name}")
+
+        # Test the binary
+        print(f"\n🧪 Testing yt-dlp installation...")
+        try:
+            import subprocess
+            result = subprocess.run([str(binary_path), "--version"],
+                                    capture_output=True, text=True, timeout=10)
+            if result.returncode == 0:
+                version = result.stdout.strip()
+                print(f"✅ yt-dlp is working! Version: {version}")
+            else:
+                print(f"⚠️ yt-dlp test failed: {result.stderr}")
+        except Exception as e:
+            print(f"⚠️ Could not test yt-dlp: {e}")
+
+        # Check and install FFmpeg
+        print(f"\n🎬 Checking FFmpeg installation...")
+        if check_ffmpeg():
+            print(f"✅ FFmpeg is already installed and working!")
+        else:
+            print(f"⚠️ FFmpeg not found. Installing...")
+            if install_ffmpeg():
+                print(f"✅ FFmpeg installed successfully!")
+            else:
+                print(f"⚠️ FFmpeg installation failed. The tool will still work but may be slower.")
+
+        print(f"\n🎉 Setup completed successfully!")
+        print(f"📁 yt-dlp binary location: {binary_path}")
+        print(f"🖥️ Platform: {platform_name}")
+        print(f"\n🎉 You're ready to use the Karaoke Video Downloader!")
+        print(f"Run: python download_karaoke.py --help")
+
+    else:
+        print(f"\n❌ Setup failed. Please manually download yt-dlp for {platform_name}")
+        print(f"Download URL: {download_url}")
+        print(f"Save to: {binary_path}")
+
+
+if __name__ == "__main__":
+    main()
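setup_platform.py above downloads a per-platform binary, while the pip module remains an alternative install path. A minimal sketch (assumption: `resolve_yt_dlp_command` is a hypothetical helper, not part of this diff) of how a caller could prefer the downloaded binary and fall back to `python -m yt_dlp` when it is absent:

```python
import sys
from pathlib import Path

def resolve_yt_dlp_command(binary_path):
    """Prefer the platform binary fetched by setup_platform.py; fall back
    to the pip-installed module when the binary is missing."""
    path = Path(binary_path)
    if path.exists():
        return [str(path)]
    # `python -m yt_dlp` works wherever yt-dlp was installed via pip
    return [sys.executable, "-m", "yt_dlp"]

cmd = resolve_yt_dlp_command("downloader/yt-dlp_macos") + ["--version"]
print(cmd)
```

Either resolved prefix can then be extended with the usual yt-dlp arguments and passed to `subprocess.run`.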
@ -1,198 +0,0 @@
#!/usr/bin/env python3
"""
Helper script to add manual videos to the manual videos collection.
"""

import json
import re
from pathlib import Path
from typing import Dict, List, Optional

from karaoke_downloader.data_path_manager import get_data_path_manager


def extract_video_id(url: str) -> Optional[str]:
    """Extract video ID from YouTube URL."""
    patterns = [
        r'(?:youtube\.com/watch\?v=|youtu\.be/|youtube\.com/embed/)([a-zA-Z0-9_-]{11})',
        r'youtube\.com/watch\?.*v=([a-zA-Z0-9_-]{11})'
    ]

    for pattern in patterns:
        match = re.search(pattern, url)
        if match:
            return match.group(1)
    return None


def add_manual_video(title: str, url: str, manual_file: str = None):
    """
    Add a manual video to the collection.

    Args:
        title: Video title (e.g., "Artist - Song (Karaoke Version)")
        url: YouTube URL
        manual_file: Path to manual videos JSON file
    """
    if manual_file is None:
        manual_file = str(get_data_path_manager().get_manual_videos_path())
    manual_path = Path(manual_file)

    # Load existing data or create new
    if manual_path.exists():
        with open(manual_path, 'r', encoding='utf-8') as f:
            data = json.load(f)
    else:
        data = {
            "channel_name": "@ManualVideos",
            "channel_url": "manual://static",
            "description": "Manual collection of individual karaoke videos",
            "videos": [],
            "parsing_rules": {
                "format": "artist_title_separator",
                "separator": " - ",
                "artist_first": True,
                "title_cleanup": {
                    "remove_suffix": {
                        "suffixes": ["(Karaoke)", "(Karaoke Version)", "(Karaoke Version) Lyrics"]
                    }
                }
            }
        }

    # Extract video ID
    video_id = extract_video_id(url)
    if not video_id:
        print(f"❌ Could not extract video ID from URL: {url}")
        return False

    # Check if video already exists
    existing_ids = [video.get("id") for video in data["videos"]]
    if video_id in existing_ids:
        print(f"⚠️ Video already exists: {title}")
        return False

    # Add new video
    new_video = {
        "title": title,
        "url": url,
        "id": video_id,
        "upload_date": "2024-01-01",  # Default date
        "duration": 180,  # Default duration
        "view_count": 1000  # Default view count
    }

    data["videos"].append(new_video)

    # Save updated data
    manual_path.parent.mkdir(parents=True, exist_ok=True)
    with open(manual_path, 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=2, ensure_ascii=False)

    print(f"✅ Added video: {title}")
    print(f"   URL: {url}")
    print(f"   ID: {video_id}")
    return True


def list_manual_videos(manual_file: str = None):
    """List all manual videos."""
    if manual_file is None:
        manual_file = str(get_data_path_manager().get_manual_videos_path())
    manual_path = Path(manual_file)

    if not manual_path.exists():
        print("❌ No manual videos file found")
        return

    with open(manual_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    print(f"📋 Manual Videos ({len(data['videos'])} videos):")
    print("=" * 60)

    for i, video in enumerate(data['videos'], 1):
        print(f"{i:2d}. {video['title']}")
        print(f"    URL: {video['url']}")
        print(f"    ID: {video['id']}")
        print()


def remove_manual_video(video_id: str, manual_file: str = None):
    """Remove a manual video by ID."""
    if manual_file is None:
        manual_file = str(get_data_path_manager().get_manual_videos_path())
    manual_path = Path(manual_file)

    if not manual_path.exists():
        print("❌ No manual videos file found")
        return False

    with open(manual_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    # Find and remove video
    for i, video in enumerate(data['videos']):
        if video['id'] == video_id:
            removed_video = data['videos'].pop(i)
            with open(manual_path, 'w', encoding='utf-8') as f:
                json.dump(data, f, indent=2, ensure_ascii=False)
            print(f"✅ Removed video: {removed_video['title']}")
            return True

    print(f"❌ Video with ID '{video_id}' not found")
    return False


def main():
    """Interactive mode for adding manual videos."""
    print("🎤 Manual Video Manager")
    print("=" * 30)
    print("1. Add video")
    print("2. List videos")
    print("3. Remove video")
    print("4. Exit")

    while True:
        choice = input("\nSelect option (1-4): ").strip()

        if choice == "1":
            title = input("Enter video title (e.g., 'Artist - Song (Karaoke Version)'): ").strip()
            url = input("Enter YouTube URL: ").strip()

            if title and url:
                add_manual_video(title, url)
            else:
                print("❌ Title and URL are required")

        elif choice == "2":
            list_manual_videos()

        elif choice == "3":
            video_id = input("Enter video ID to remove: ").strip()
            if video_id:
                remove_manual_video(video_id)
            else:
                print("❌ Video ID is required")

        elif choice == "4":
            print("👋 Goodbye!")
            break

        else:
            print("❌ Invalid option")


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        # Command line mode
        if sys.argv[1] == "add" and len(sys.argv) >= 4:
            add_manual_video(sys.argv[2], sys.argv[3])
        elif sys.argv[1] == "list":
            list_manual_videos()
        elif sys.argv[1] == "remove" and len(sys.argv) >= 3:
            remove_manual_video(sys.argv[2])
        else:
            print("Usage:")
            print("  python add_manual_video.py add 'Title' 'URL'")
            print("  python add_manual_video.py list")
            print("  python add_manual_video.py remove VIDEO_ID")
    else:
        # Interactive mode
        main()
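The two regular expressions in `extract_video_id` above cover the common YouTube URL shapes (`watch?v=`, `youtu.be/`, `embed/`, and `watch?` with extra query parameters). A quick standalone check of the same patterns:

```python
import re
from typing import Optional

# Same patterns as extract_video_id above
PATTERNS = [
    r'(?:youtube\.com/watch\?v=|youtu\.be/|youtube\.com/embed/)([a-zA-Z0-9_-]{11})',
    r'youtube\.com/watch\?.*v=([a-zA-Z0-9_-]{11})',
]


def video_id(url: str) -> Optional[str]:
    """Return the 11-character video ID, or None if the URL has none."""
    for pattern in PATTERNS:
        match = re.search(pattern, url)
        if match:
            return match.group(1)
    return None


for url in ("https://www.youtube.com/watch?v=dQw4w9WgXcQ",
            "https://youtu.be/dQw4w9WgXcQ",
            "https://www.youtube.com/embed/dQw4w9WgXcQ"):
    print(video_id(url))  # dQw4w9WgXcQ for each URL shape
```

The fixed `{11}` length is what lets the later validation assume exactly eleven `[a-zA-Z0-9_-]` characters.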
@ -1,127 +0,0 @@
#!/usr/bin/env python3
"""
Script to build channel cache from raw yt-dlp output file.
This uses the fixed parsing logic to handle titles with | characters.
"""

import json
import re
from datetime import datetime
from pathlib import Path

from karaoke_downloader.data_path_manager import get_data_path_manager


def parse_raw_output_file(raw_file_path):
    """Parse the raw output file and extract valid videos."""
    videos = []
    invalid_count = 0

    print(f"🔍 Parsing raw output file: {raw_file_path}")

    with open(raw_file_path, 'r', encoding='utf-8') as f:
        lines = f.readlines()

    # Skip header lines (lines starting with #)
    data_lines = [line for line in lines if not line.strip().startswith('#') and line.strip()]

    print(f"📄 Found {len(data_lines)} data lines to process")

    for i, line in enumerate(data_lines):
        if i % 1000 == 0 and i > 0:  # Progress indicator every 1000 lines
            print(f"📊 Processing line {i}/{len(data_lines)}... ({i/len(data_lines)*100:.1f}%)")

        # Remove line number prefix (e.g., "  1234: ")
        line = re.sub(r'^\s*\d+:\s*', '', line.strip())

        # More robust parsing that handles titles with | characters:
        # extract the video ID directly from the URL that yt-dlp provides.

        # Find the URL and extract the video ID from it
        url_match = re.search(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
        if not url_match:
            invalid_count += 1
            if invalid_count <= 5:
                print(f"⚠️ Skipping line with no URL: '{line[:100]}...'")
            elif invalid_count == 6:
                print(f"⚠️ ... and {len(data_lines) - i - 1} more invalid lines")
            continue

        # Extract video ID directly from the URL
        video_id = url_match.group(1)

        # Extract title (everything before the video ID in the line)
        title = line[:line.find(video_id)].rstrip('|').strip()

        # Validate video ID
        if video_id and (
            len(video_id) == 11 and
            video_id.replace('-', '').replace('_', '').isalnum() and
            " " not in video_id and
            "Lyrics" not in video_id and
            "KARAOKE" not in video_id.upper() and
            "Vocal" not in video_id and
            "Guide" not in video_id
        ):
            videos.append({"title": title, "id": video_id})
        else:
            invalid_count += 1
            if invalid_count <= 5:  # Only show first 5 invalid IDs
                print(f"⚠️ Skipping invalid video ID: '{video_id}' for title: '{title[:50]}...'")
            elif invalid_count == 6:
                print(f"⚠️ ... and {len(data_lines) - i - 1} more invalid IDs")

    print(f"✅ Parsed {len(videos)} valid videos from raw output")
    print(f"⚠️ Skipped {invalid_count} invalid video IDs")

    return videos


def save_cache_file(channel_id, videos, cache_dir=None):
    """Save the parsed videos to a cache file."""
    if cache_dir is None:
        cache_dir = str(get_data_path_manager().get_channel_cache_dir())
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Sanitize channel ID for filename
    safe_channel_id = re.sub(r'[<>:"/\\|?*]', '_', channel_id)
    cache_file = cache_dir / f"{safe_channel_id}.json"

    data = {
        'channel_id': channel_id,
        'videos': videos,
        'last_updated': datetime.now().isoformat(),
        'video_count': len(videos)
    }

    with open(cache_file, 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=2, ensure_ascii=False)

    print(f"💾 Saved cache to: {cache_file.name}")
    return cache_file


def main():
    """Main function to build cache from raw output."""
    data_path_manager = get_data_path_manager()
    raw_file_path = data_path_manager.get_channel_cache_dir() / "@VocalStarKaraoke_raw_output.txt"

    if not raw_file_path.exists():
        print(f"❌ Raw output file not found: {raw_file_path}")
        return

    # Parse the raw output file
    videos = parse_raw_output_file(raw_file_path)

    if not videos:
        print("❌ No valid videos found")
        return

    # Save to cache file
    channel_id = "@VocalStarKaraoke"
    cache_file = save_cache_file(channel_id, videos)

    print(f"🎉 Cache build complete!")
    print(f"📊 Total videos in cache: {len(videos)}")
    print(f"📁 Cache file: {cache_file}")


if __name__ == "__main__":
    main()
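The URL-anchored parsing above sidesteps splitting on `|` entirely, which is why titles containing `|` survive. A standalone sketch of the same three steps on one synthetic raw line (the sample title, ID, and `title|id|url` layout are illustrative, not the guaranteed raw-output format):

```python
import re

# Synthetic example of one raw output line, with a leading line-number prefix
raw = "  42: Shape of You - Karaoke|abcdefghijk|https://www.youtube.com/watch?v=abcdefghijk"

# Step 1: strip the line-number prefix, as the parser above does
line = re.sub(r'^\s*\d+:\s*', '', raw.strip())

# Step 2: anchor on the URL instead of splitting on "|"
url_match = re.search(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
video_id = url_match.group(1)

# Step 3: everything before the first occurrence of the ID is the title
title = line[:line.find(video_id)].rstrip('|').strip()

print(video_id, "|", title)
```

Because the ID also appears standalone before the URL here, `line.find(video_id)` stops at that first occurrence and the trailing `|` is stripped off the title.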
@ -1,164 +0,0 @@
#!/usr/bin/env python3
"""
Utility script to identify and clean up duplicate files with (2), (3) suffixes.
This helps clean up files that were created before the duplicate prevention was implemented.
"""

import re
from pathlib import Path
from typing import Dict, List, Tuple


def find_duplicate_files(downloads_dir: str = "downloads") -> Dict[str, List[Tuple[Path, int]]]:
    """
    Find duplicate files with (2), (3), etc. suffixes in the downloads directory.

    Args:
        downloads_dir: Path to downloads directory

    Returns:
        Dictionary mapping base filenames to lists of (file path, suffix number) tuples
    """
    downloads_path = Path(downloads_dir)
    if not downloads_path.exists():
        print(f"❌ Downloads directory not found: {downloads_dir}")
        return {}

    duplicates = {}

    # Scan all MP4 files in the downloads directory
    for mp4_file in downloads_path.rglob("*.mp4"):
        filename = mp4_file.name

        # Check if this is a duplicate file with (2), (3), etc.
        match = re.match(r'^(.+?)\s*\((\d+)\)\.mp4$', filename)
        if match:
            base_name = match.group(1)
            suffix_num = int(match.group(2))

            if base_name not in duplicates:
                duplicates[base_name] = []

            duplicates[base_name].append((mp4_file, suffix_num))

    # Sort duplicates by suffix number
    for base_name in duplicates:
        duplicates[base_name].sort(key=lambda x: x[1])

    return duplicates


def analyze_duplicates(duplicates: Dict[str, List[Tuple[Path, int]]]) -> None:
    """
    Analyze and display information about found duplicates.

    Args:
        duplicates: Dictionary of duplicate files
    """
    if not duplicates:
        print("✅ No duplicate files found!")
        return

    print(f"🔍 Found {len(duplicates)} sets of duplicate files:")
    print()

    total_duplicates = 0
    for base_name, files in duplicates.items():
        print(f"📁 {base_name}")
        for file_path, suffix in files:
            file_size = file_path.stat().st_size / (1024 * 1024)  # MB
            print(f"   ({suffix}) {file_path.name} - {file_size:.1f} MB")
        print()
        total_duplicates += len(files) - 1  # -1 because we keep the original

    print(f"📊 Summary: {len(duplicates)} base files with {total_duplicates} duplicate files")


def cleanup_duplicates(duplicates: Dict[str, List[Tuple[Path, int]]], dry_run: bool = True) -> None:
    """
    Clean up duplicate files, keeping only the first occurrence.

    Args:
        duplicates: Dictionary of duplicate files
        dry_run: If True, only show what would be deleted without actually deleting
    """
    if not duplicates:
        print("✅ No duplicates to clean up!")
        return

    mode = "DRY RUN" if dry_run else "ACTUAL CLEANUP"
    print(f"🧹 Starting {mode}...")
    print()

    total_deleted = 0
    total_size_freed = 0

    for base_name, files in duplicates.items():
        print(f"📁 Processing: {base_name}")

        # Keep the first file (lowest suffix number), delete the rest
        files_to_delete = files[1:]  # Skip the first file

        for file_path, suffix in files_to_delete:
            file_size = file_path.stat().st_size / (1024 * 1024)  # MB

            if dry_run:
                print(f"   🗑️ Would delete: {file_path.name} ({file_size:.1f} MB)")
            else:
                try:
                    file_path.unlink()
                    print(f"   ✅ Deleted: {file_path.name} ({file_size:.1f} MB)")
                    total_deleted += 1
                    total_size_freed += file_size
                except Exception as e:
                    print(f"   ❌ Failed to delete {file_path.name}: {e}")

        print()

    if dry_run:
        print(f"📊 DRY RUN SUMMARY: Would delete {len([f for files in duplicates.values() for f in files[1:]])} files")
    else:
        print(f"📊 CLEANUP SUMMARY: Deleted {total_deleted} files, freed {total_size_freed:.1f} MB")


def main():
    """Main function to run the duplicate file cleanup."""
    print("🎵 Karaoke Video Downloader - Duplicate File Cleanup")
    print("=" * 50)
    print()

    # Find duplicates
    duplicates = find_duplicate_files()

    if not duplicates:
        print("✅ No duplicate files found!")
        return

    # Analyze duplicates
    analyze_duplicates(duplicates)
    print()

    # Ask user what to do
    while True:
        print("Options:")
        print("1. Dry run (show what would be deleted)")
        print("2. Actually delete duplicate files")
        print("3. Exit without doing anything")

        choice = input("\nEnter your choice (1-3): ").strip()

        if choice == "1":
            cleanup_duplicates(duplicates, dry_run=True)
            break
        elif choice == "2":
            confirm = input("⚠️ Are you sure you want to delete duplicate files? (yes/no): ").strip().lower()
            if confirm in ["yes", "y"]:
                cleanup_duplicates(duplicates, dry_run=False)
            else:
                print("❌ Cleanup cancelled.")
            break
        elif choice == "3":
            print("❌ Exiting without cleanup.")
            break
        else:
            print("❌ Invalid choice. Please enter 1, 2, or 3.")


if __name__ == "__main__":
    main()
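The `(N)` suffix detection above hinges on a single regex: a lazy base-name group, optional whitespace, and a parenthesized number before `.mp4`. A standalone sketch grouping a few sample filenames (the names are made up):

```python
import re

sample_names = [
    "Artist - Song.mp4",       # original file, no suffix — not matched
    "Artist - Song (2).mp4",
    "Artist - Song (3).mp4",
    "Other - Track (2).mp4",
]

duplicates = {}
for name in sample_names:
    # Same pattern as find_duplicate_files above
    match = re.match(r'^(.+?)\s*\((\d+)\)\.mp4$', name)
    if match:
        base_name = match.group(1)
        suffix_num = int(match.group(2))
        duplicates.setdefault(base_name, []).append(suffix_num)

print(duplicates)  # {'Artist - Song': [2, 3], 'Other - Track': [2]}
```

The lazy `(.+?)` plus `\s*` keeps the trailing space out of the base name, so `"Artist - Song (2).mp4"` and `"Artist - Song (3).mp4"` group under the same key.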
@ -1,465 +0,0 @@
#!/usr/bin/env python3
"""
Fix artist name formatting for Let's Sing Karaoke channel.

This script specifically targets the "Last Name, First Name" format and converts it to
"First Name Last Name" format in ID3 tags. It only processes entries where there is exactly one comma
followed by exactly 2 words, to avoid affecting multi-artist entries.

Usage:
    python fix_artist_name_format.py --preview   # Show what would be changed
    python fix_artist_name_format.py --apply     # Actually make the changes
    python fix_artist_name_format.py --external "D:\Karaoke\Karaoke\MP4\Let's Sing Karaoke"  # Use external directory
"""

import json
import os
import re
import shutil
import argparse
from pathlib import Path
from typing import Dict, List, Tuple, Optional

# Try to import mutagen for ID3 tag manipulation
try:
    from mutagen.mp4 import MP4
    MUTAGEN_AVAILABLE = True
except ImportError:
    MUTAGEN_AVAILABLE = False
    print("⚠️ mutagen not available - install with: pip install mutagen")


def is_lastname_firstname_format(artist_name: str) -> bool:
    """
    Check if artist name is in "Last Name, First Name" format.

    Args:
        artist_name: The artist name to check

    Returns:
        True if the name matches "Last Name, First Name" format with exactly 2 words after the comma
    """
    if ',' not in artist_name:
        return False

    # Split by comma
    parts = artist_name.split(',', 1)
    if len(parts) != 2:
        return False

    last_name = parts[0].strip()
    first_name_part = parts[1].strip()

    # Check if there are exactly 2 words after the comma
    words_after_comma = first_name_part.split()
    if len(words_after_comma) != 2:
        return False

    # Additional check: make sure it's not a multi-artist entry.
    # If there are more than 4 words total in the artist name, it might be multi-artist.
    total_words = len(artist_name.split())
    if total_words > 4:  # "Last, First Middle" (4 words max for a single artist)
        return False

    return True


def convert_to_firstname_lastname(artist_name: str) -> str:
    """
    Convert "Last Name, First Name" to "First Name Last Name".

    Args:
        artist_name: Artist name in "Last Name, First Name" format

    Returns:
        Artist name in "First Name Last Name" format
    """
    parts = artist_name.split(',', 1)
    last_name = parts[0].strip()
    first_name_part = parts[1].strip()

    # Split the first name part into words
    words = first_name_part.split()
    if len(words) == 2:
        first_name = words[0]
        middle_name = words[1]
        return f"{first_name} {middle_name} {last_name}"
    else:
        # Fallback - just reverse the parts
        return f"{first_name_part} {last_name}"
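The split-and-reverse logic above reduces to: cut at the first comma, then put the post-comma part in front of the pre-comma part. A quick standalone sketch of that core move (the sample names are illustrative):

```python
def flip(artist_name: str) -> str:
    """Convert 'Last, First [Middle]' to 'First [Middle] Last' (same core logic as above)."""
    last_name, first_part = (p.strip() for p in artist_name.split(',', 1))
    return f"{first_part} {last_name}"


print(flip("Presley, Elvis"))    # Elvis Presley
print(flip("Cash, Johnny R."))   # Johnny R. Cash
```

Both branches of `convert_to_firstname_lastname` above produce this same word order; the two-word branch only differs in how it names the pieces.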
def extract_artist_title_from_filename(filename: str) -> Tuple[str, str]:
    """
    Extract artist and title from a filename.

    Args:
        filename: MP4 filename (without extension)

    Returns:
        Tuple of (artist, title)
    """
    # Remove .mp4 extension
    if filename.endswith('.mp4'):
        filename = filename[:-4]

    # Look for " - " separator
    if " - " in filename:
        parts = filename.split(" - ", 1)
        return parts[0].strip(), parts[1].strip()

    return "", filename


def update_id3_tags(file_path: str, new_artist: str, apply_changes: bool = False) -> bool:
    """
    Update the ID3 tags in an MP4 file.

    Args:
        file_path: Path to the MP4 file
        new_artist: New artist name to set
        apply_changes: Whether to actually apply changes or just preview

    Returns:
        True if successful, False otherwise
    """
    if not MUTAGEN_AVAILABLE:
        print(f"⚠️ mutagen not available - cannot update ID3 tags for {file_path}")
        return False

    try:
        mp4 = MP4(file_path)

        if apply_changes:
            # Update the artist tag
            mp4["\xa9ART"] = new_artist
            mp4.save()
            print(f"📝 Updated ID3 tag: {os.path.basename(file_path)} → Artist: '{new_artist}'")
        else:
            # Just preview what would be changed
            current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"
            print(f"📝 Would update ID3 tag: {os.path.basename(file_path)} → Artist: '{current_artist}' → '{new_artist}'")

        return True

    except Exception as e:
        print(f"❌ Failed to update ID3 tags for {file_path}: {e}")
        return False


def scan_external_directory(directory_path: str) -> List[Dict]:
    """
    Scan external directory for MP4 files with "Last Name, First Name" format in ID3 tags.

    Args:
        directory_path: Path to the external directory

    Returns:
        List of files that need ID3 tag updates
    """
    if not os.path.exists(directory_path):
        print(f"❌ Directory not found: {directory_path}")
        return []

    if not MUTAGEN_AVAILABLE:
        print("❌ mutagen not available - cannot scan ID3 tags")
        return []

    files_to_update = []

    # Scan for MP4 files
    for file_path in Path(directory_path).glob("*.mp4"):
        try:
            mp4 = MP4(str(file_path))
            current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"

            if current_artist and is_lastname_firstname_format(current_artist):
                new_artist = convert_to_firstname_lastname(current_artist)

                files_to_update.append({
                    'file_path': str(file_path),
                    'filename': file_path.name,
                    'old_artist': current_artist,
                    'new_artist': new_artist
                })

        except Exception as e:
            print(f"⚠️ Could not read ID3 tags from {file_path.name}: {e}")

    return files_to_update


def update_tracking_file(tracking_file: str, channel_name: str = "Let's Sing Karaoke", apply_changes: bool = False) -> Tuple[int, List[Dict]]:
    """
    Update the karaoke tracking file to fix artist name formatting.

    Args:
        tracking_file: Path to the tracking JSON file
        channel_name: Channel name to target (default: Let's Sing Karaoke)
        apply_changes: Whether to actually apply changes or just preview

    Returns:
        Tuple of (number of changes made, list of changed entries)
    """
    if not os.path.exists(tracking_file):
        print(f"❌ Tracking file not found: {tracking_file}")
        return 0, []

    # Load the tracking data
    with open(tracking_file, 'r', encoding='utf-8') as f:
        data = json.load(f)

    changes_made = 0
    changed_entries = []

    # Process songs
    for song_key, song_data in data.get('songs', {}).items():
        if song_data.get('channel_name') != channel_name:
            continue

        artist = song_data.get('artist', '')
        if not artist or not is_lastname_firstname_format(artist):
            continue

        # Convert the artist name
        new_artist = convert_to_firstname_lastname(artist)

        if apply_changes:
            # Update the tracking data
            song_data['artist'] = new_artist

            # Update the video title if it exists and contains the old artist name
            video_title = song_data.get('video_title', '')
            if video_title and artist in video_title:
                song_data['video_title'] = video_title.replace(artist, new_artist)

            # Update the file path if it exists
            file_path = song_data.get('file_path', '')
            if file_path and artist in file_path:
                song_data['file_path'] = file_path.replace(artist, new_artist)

        changes_made += 1
        changed_entries.append({
            'song_key': song_key,
            'old_artist': artist,
            'new_artist': new_artist,
            'title': song_data.get('title', ''),
            'file_path': song_data.get('file_path', '')
        })

        print(f"🔄 {'Updated' if apply_changes else 'Would update'}: '{artist}' → '{new_artist}' ({song_data.get('title', '')})")

    # Save the updated data
    if apply_changes and changes_made > 0:
        # Create backup
        backup_file = f"{tracking_file}.backup"
        shutil.copy2(tracking_file, backup_file)
        print(f"💾 Created backup: {backup_file}")

        # Save updated file
        with open(tracking_file, 'w', encoding='utf-8') as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
        print(f"💾 Updated tracking file: {tracking_file}")

    return changes_made, changed_entries


def update_songlist_tracking(songlist_file: str, channel_name: str = "Let's Sing Karaoke", apply_changes: bool = False) -> Tuple[int, List[Dict]]:
    """
    Update the songlist tracking file to fix artist name formatting.

    Args:
        songlist_file: Path to the songlist tracking JSON file
        channel_name: Channel name to target (default: Let's Sing Karaoke)
        apply_changes: Whether to actually apply changes or just preview

    Returns:
        Tuple of (number of changes made, list of changed entries)
    """
    if not os.path.exists(songlist_file):
        print(f"❌ Songlist tracking file not found: {songlist_file}")
        return 0, []

    # Load the songlist data
    with open(songlist_file, 'r', encoding='utf-8') as f:
        data = json.load(f)

    changes_made = 0
    changed_entries = []

    # Process songlist entries
    for song_key, song_data in data.items():
        artist = song_data.get('artist', '')
        if not artist or not is_lastname_firstname_format(artist):
            continue

        # Convert the artist name
        new_artist = convert_to_firstname_lastname(artist)

        if apply_changes:
            # Update the songlist data
            song_data['artist'] = new_artist

        changes_made += 1
        changed_entries.append({
            'song_key': song_key,
            'old_artist': artist,
            'new_artist': new_artist,
            'title': song_data.get('title', '')
        })

        print(f"🔄 {'Updated' if apply_changes else 'Would update'} songlist: '{artist}' → '{new_artist}' ({song_data.get('title', '')})")

    # Save the updated data
    if apply_changes and changes_made > 0:
        # Create backup
        backup_file = f"{songlist_file}.backup"
        shutil.copy2(songlist_file, backup_file)
        print(f"💾 Created backup: {backup_file}")

        # Save updated file
        with open(songlist_file, 'w', encoding='utf-8') as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
        print(f"💾 Updated songlist file: {songlist_file}")

    return changes_made, changed_entries


def update_id3_tags_for_files(files_to_update: List[Dict], apply_changes: bool = False) -> int:
    """
    Update ID3 tags for a list of files.

    Args:
        files_to_update: List of files to update
        apply_changes: Whether to actually apply changes or just preview

    Returns:
        Number of files successfully updated
    """
    updated_count = 0

    for file_info in files_to_update:
        file_path = file_info['file_path']
        new_artist = file_info['new_artist']

        if update_id3_tags(file_path, new_artist, apply_changes):
            updated_count += 1

    return updated_count


def main():
    """Main function to run the artist name fix script."""
    parser = argparse.ArgumentParser(description="Fix artist name formatting in ID3 tags for Let's Sing Karaoke")
    parser.add_argument('--preview', action='store_true', help='Show what would be changed without making changes')
    parser.add_argument('--apply', action='store_true', help='Actually apply the changes')
    parser.add_argument('--external', type=str, help='Path to external karaoke directory')

    args = parser.parse_args()

    # Default to preview mode if no action specified
    if not args.preview and not args.apply:
        args.preview = True

    print("🎤 Artist Name Format Fix Script (ID3 Tags Only)")
    print("=" * 60)
    print("This script will fix 'Last Name, First Name' format to 'First Name Last Name'")
    print("Only targeting Let's Sing Karaoke channel to avoid affecting other channels.")
    print("Focusing on ID3 tags only - filenames will not be changed.")
    print()

    if not MUTAGEN_AVAILABLE:
        print("❌ mutagen library not available!")
        print("Please install it with: pip install mutagen")
        return

    if args.preview:
        print("🔍 PREVIEW MODE - No changes will be made")
    else:
        print("⚡ APPLY MODE - Changes will be made")
    print()

    # File paths
    tracking_file = "data/karaoke_tracking.json"
    songlist_file = "data/songlist_tracking.json"

    # Process external directory if specified
    if args.external:
        print(f"📁 Scanning external directory: {args.external}")
        external_files = scan_external_directory(args.external)

        if external_files:
            print(f"\n📋 Found {len(external_files)} files with 'Last Name, First Name' format in ID3 tags:")
            for file_info in external_files:
                print(f"  • {file_info['filename']}: '{file_info['old_artist']}' → '{file_info['new_artist']}'")

            if args.apply:
                print(f"\n📝 Updating ID3 tags in external files...")
                updated_count = update_id3_tags_for_files(external_files, apply_changes=True)
                print(f"✅ Updated ID3 tags in {updated_count} external files")
            else:
                print(f"\n📝 Would update ID3 tags in {len(external_files)} external files")
|
|
||||||
else:
|
|
||||||
print("✅ No files with 'Last Name, First Name' format found in ID3 tags")
|
|
||||||
|
|
||||||
# Process tracking files (only if they exist in current project)
|
|
||||||
if os.path.exists(tracking_file):
|
|
||||||
print(f"\n📊 Processing karaoke tracking file...")
|
|
||||||
tracking_changes, tracking_entries = update_tracking_file(tracking_file, apply_changes=args.apply)
|
|
||||||
else:
|
|
||||||
print(f"\n⚠️ Tracking file not found: {tracking_file}")
|
|
||||||
tracking_changes = 0
|
|
||||||
|
|
||||||
if os.path.exists(songlist_file):
|
|
||||||
print(f"\n📊 Processing songlist tracking file...")
|
|
||||||
songlist_changes, songlist_entries = update_songlist_tracking(songlist_file, apply_changes=args.apply)
|
|
||||||
else:
|
|
||||||
print(f"\n⚠️ Songlist tracking file not found: {songlist_file}")
|
|
||||||
songlist_changes = 0
|
|
||||||
|
|
||||||
# Process local downloads directory ID3 tags
|
|
||||||
downloads_dir = "downloads"
|
|
||||||
local_id3_updates = 0
|
|
||||||
if os.path.exists(downloads_dir) and tracking_changes > 0:
|
|
||||||
print(f"\n📝 Processing ID3 tags in local downloads directory...")
|
|
||||||
# Scan local downloads for files that need ID3 tag updates
|
|
||||||
local_files = []
|
|
||||||
for entry in tracking_entries:
|
|
||||||
file_path = entry.get('file_path', '')
|
|
||||||
if file_path and os.path.exists(file_path.replace('\\', '/')):
|
|
||||||
local_files.append({
|
|
||||||
'file_path': file_path.replace('\\', '/'),
|
|
||||||
'filename': os.path.basename(file_path),
|
|
||||||
'old_artist': entry['old_artist'],
|
|
||||||
'new_artist': entry['new_artist']
|
|
||||||
})
|
|
||||||
|
|
||||||
if local_files:
|
|
||||||
local_id3_updates = update_id3_tags_for_files(local_files, apply_changes=args.apply)
|
|
||||||
|
|
||||||
total_changes = tracking_changes + songlist_changes
|
|
||||||
|
|
||||||
print("\n" + "=" * 60)
|
|
||||||
print("📋 Summary:")
|
|
||||||
print(f" • Tracking file changes: {tracking_changes}")
|
|
||||||
print(f" • Songlist file changes: {songlist_changes}")
|
|
||||||
print(f" • Local ID3 tag updates: {local_id3_updates}")
|
|
||||||
print(f" • Total changes: {total_changes}")
|
|
||||||
|
|
||||||
if args.external:
|
|
||||||
external_count = len(scan_external_directory(args.external)) if args.preview else len(external_files)
|
|
||||||
print(f" • External ID3 tag updates: {external_count}")
|
|
||||||
|
|
||||||
if total_changes > 0 or (args.external and external_count > 0):
|
|
||||||
if args.apply:
|
|
||||||
print("\n✅ Artist name formatting in ID3 tags has been fixed!")
|
|
||||||
print("💾 Backups have been created for all modified files.")
|
|
||||||
print("🔄 You may need to re-run your karaoke downloader to update any cached data.")
|
|
||||||
else:
|
|
||||||
print("\n🔍 Preview complete. Use --apply to make these changes.")
|
|
||||||
else:
|
|
||||||
print("\n✅ No changes needed! All artist names are already in the correct format.")
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
main()
|
|
||||||
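The script above backs a tracking file up before overwriting it (`shutil.copy2` to a `.backup` sibling, then a full JSON rewrite). A minimal standalone sketch of that backup-then-write pattern — `save_with_backup` is a hypothetical helper for illustration, not part of the project:

```python
import json
import shutil
from pathlib import Path

def save_with_backup(path: str, data: dict) -> None:
    """Copy any existing file to <path>.backup, then write the new JSON."""
    if Path(path).exists():
        shutil.copy2(path, f"{path}.backup")  # copy2 preserves timestamps as well as contents
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=2, ensure_ascii=False)

save_with_backup("songlist_tracking.json", {"songs": {}})
```

Because the backup is taken just before each rewrite, a bad run can always be undone by copying the `.backup` file back over the original.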
@ -1,295 +0,0 @@
#!/usr/bin/env python3
"""
Fix artist name formatting for Let's Sing Karaoke channel.

This script specifically targets the "Last Name, First Name" format and converts it to
"First Name Last Name" format in ID3 tags. It only processes entries where there is exactly one comma
followed by one or two words, to avoid affecting multi-artist entries.

Usage:
    python fix_artist_name_format_simple.py --preview   # Show what would be changed
    python fix_artist_name_format_simple.py --apply     # Actually make the changes
    python fix_artist_name_format_simple.py --external "D:\Karaoke\Karaoke\MP4\Let's Sing Karaoke"  # Use external directory
"""

import json
import os
import re
import shutil
import argparse
from pathlib import Path
from typing import Dict, List, Tuple, Optional

# Try to import mutagen for ID3 tag manipulation
try:
    from mutagen.mp4 import MP4
    MUTAGEN_AVAILABLE = True
except ImportError:
    MUTAGEN_AVAILABLE = False
    print("WARNING: mutagen not available - install with: pip install mutagen")


def is_lastname_firstname_format(artist_name: str) -> bool:
    """
    Check if artist name is in "Last Name, First Name" format.

    Args:
        artist_name: The artist name to check

    Returns:
        True if the name matches "Last Name, First Name" format with exactly 1 or 2 words after comma
    """
    if ',' not in artist_name:
        return False

    # Split by comma
    parts = artist_name.split(',', 1)
    if len(parts) != 2:
        return False

    last_name = parts[0].strip()
    first_name_part = parts[1].strip()

    # Check if there are exactly 1 or 2 words after the comma
    words_after_comma = first_name_part.split()
    if len(words_after_comma) not in [1, 2]:
        return False

    # Additional check: make sure it's not a multi-artist entry
    # If there are more than 4 words total in the artist name, it might be multi-artist
    total_words = len(artist_name.split())
    if total_words > 4:  # Last, First Name (4 words max for single artist)
        return False

    return True


def convert_lastname_firstname(artist_name: str) -> str:
    """
    Convert "Last Name, First Name" to "First Name Last Name".

    Args:
        artist_name: The artist name to convert

    Returns:
        The converted artist name
    """
    if ',' not in artist_name:
        return artist_name

    parts = artist_name.split(',', 1)
    if len(parts) != 2:
        return artist_name

    last_name = parts[0].strip()
    first_name = parts[1].strip()

    return f"{first_name} {last_name}"


def process_artist_name(artist_name: str) -> str:
    """
    Process an artist name, handling both single artists and multiple artists separated by "&".

    Args:
        artist_name: The artist name to process

    Returns:
        The processed artist name
    """
    if '&' in artist_name:
        # Split by "&" and process each artist individually
        artists = [artist.strip() for artist in artist_name.split('&')]
        processed_artists = []

        for artist in artists:
            if is_lastname_firstname_format(artist):
                processed_artist = convert_lastname_firstname(artist)
                processed_artists.append(processed_artist)
            else:
                processed_artists.append(artist)

        # Rejoin with "&"
        return ' & '.join(processed_artists)
    else:
        # Single artist
        if is_lastname_firstname_format(artist_name):
            return convert_lastname_firstname(artist_name)
        else:
            return artist_name


def update_id3_tags(file_path: str, new_artist: str, apply_changes: bool = False) -> bool:
    """
    Update the ID3 tags in an MP4 file.

    Args:
        file_path: Path to the MP4 file
        new_artist: New artist name to set
        apply_changes: Whether to actually apply changes or just preview

    Returns:
        True if successful, False otherwise
    """
    if not MUTAGEN_AVAILABLE:
        print(f"WARNING: mutagen not available - cannot update ID3 tags for {file_path}")
        return False

    try:
        mp4 = MP4(file_path)

        if apply_changes:
            # Update the artist tag
            mp4["\xa9ART"] = new_artist
            mp4.save()
            print(f"UPDATED ID3 tag: {os.path.basename(file_path)} -> Artist: '{new_artist}'")
        else:
            # Just preview what would be changed
            current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"
            print(f"WOULD UPDATE ID3 tag: {os.path.basename(file_path)} -> Artist: '{current_artist}' -> '{new_artist}'")

        return True

    except Exception as e:
        print(f"ERROR: Failed to update ID3 tags for {file_path}: {e}")
        return False


def scan_external_directory(directory_path: str, debug: bool = False) -> List[Dict]:
    """
    Scan external directory for MP4 files with "Last Name, First Name" format in ID3 tags.

    Args:
        directory_path: Path to the external directory
        debug: Whether to show debug information

    Returns:
        List of files that need ID3 tag updates
    """
    if not os.path.exists(directory_path):
        print(f"ERROR: Directory not found: {directory_path}")
        return []

    if not MUTAGEN_AVAILABLE:
        print("ERROR: mutagen not available - cannot scan ID3 tags")
        return []

    files_to_update = []
    total_files = 0
    files_with_artist_tags = 0

    # Scan for MP4 files
    for file_path in Path(directory_path).glob("*.mp4"):
        total_files += 1
        try:
            mp4 = MP4(str(file_path))
            current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"

            if current_artist != "Unknown":
                files_with_artist_tags += 1

                if debug:
                    print(f"DEBUG: {file_path.name} -> Artist: '{current_artist}'")

                # Process the artist name to handle multiple artists
                processed_artist = process_artist_name(current_artist)

                if processed_artist != current_artist:
                    files_to_update.append({
                        'file_path': str(file_path),
                        'filename': file_path.name,
                        'old_artist': current_artist,
                        'new_artist': processed_artist
                    })

                    if debug:
                        print(f"DEBUG: MATCH FOUND - {file_path.name}: '{current_artist}' -> '{processed_artist}'")

        except Exception as e:
            if debug:
                print(f"WARNING: Could not read ID3 tags from {file_path.name}: {e}")

    print(f"INFO: Scanned {total_files} MP4 files, {files_with_artist_tags} had artist tags, {len(files_to_update)} need updates")
    return files_to_update


def update_id3_tags_for_files(files_to_update: List[Dict], apply_changes: bool = False) -> int:
    """
    Update ID3 tags for a list of files.

    Args:
        files_to_update: List of files to update
        apply_changes: Whether to actually apply changes or just preview

    Returns:
        Number of files successfully updated
    """
    updated_count = 0

    for file_info in files_to_update:
        file_path = file_info['file_path']
        new_artist = file_info['new_artist']

        if update_id3_tags(file_path, new_artist, apply_changes):
            updated_count += 1

    return updated_count


def main():
    """Main function to run the artist name fix script."""
    parser = argparse.ArgumentParser(description="Fix artist name formatting in ID3 tags for Let's Sing Karaoke")
    parser.add_argument('--preview', action='store_true', help='Show what would be changed without making changes')
    parser.add_argument('--apply', action='store_true', help='Actually apply the changes')
    parser.add_argument('--external', type=str, help='Path to external karaoke directory')
    parser.add_argument('--debug', action='store_true', help='Show debug information')

    args = parser.parse_args()

    # Default to preview mode if no action specified
    if not args.preview and not args.apply:
        args.preview = True

    print("Artist Name Format Fix Script (ID3 Tags Only)")
    print("=" * 60)
    print("This script will fix 'Last Name, First Name' format to 'First Name Last Name'")
    print("Only targeting Let's Sing Karaoke channel to avoid affecting other channels.")
    print("Focusing on ID3 tags only - filenames will not be changed.")
    print()

    if not MUTAGEN_AVAILABLE:
        print("ERROR: mutagen library not available!")
        print("Please install it with: pip install mutagen")
        return

    if args.preview:
        print("PREVIEW MODE - No changes will be made")
    else:
        print("APPLY MODE - Changes will be made")
    print()

    # Process external directory if specified
    if args.external:
        print(f"Scanning external directory: {args.external}")
        external_files = scan_external_directory(args.external, debug=args.debug)

        if external_files:
            print(f"\nFound {len(external_files)} files with 'Last Name, First Name' format in ID3 tags:")
            for file_info in external_files:
                print(f"  * {file_info['filename']}: '{file_info['old_artist']}' -> '{file_info['new_artist']}'")

            if args.apply:
                print(f"\nUpdating ID3 tags in external files...")
                updated_count = update_id3_tags_for_files(external_files, apply_changes=True)
                print(f"SUCCESS: Updated ID3 tags in {updated_count} external files")
            else:
                print(f"\nWould update ID3 tags in {len(external_files)} external files")
        else:
            print("SUCCESS: No files with 'Last Name, First Name' format found in ID3 tags")

    print("\n" + "=" * 60)
    print("Summary complete.")


if __name__ == "__main__":
    main()
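The deleted `fix_artist_name_format_simple.py` above centers on one transformation: swap "Last, First" to "First Last", but only when the part after the comma is one or two words, and handle "&"-separated multi-artist fields per artist. The same rules can be exercised in isolation — `swap_comma_name` and `fix_artist_field` are hypothetical names condensing the script's `is_lastname_firstname_format`, `convert_lastname_firstname`, and `process_artist_name`:

```python
def swap_comma_name(artist: str) -> str:
    """Convert 'Last, First' to 'First Last' when the remainder is one or two words."""
    if ',' not in artist or len(artist.split()) > 4:
        return artist  # no comma, or likely a multi-artist string
    last, first = (part.strip() for part in artist.split(',', 1))
    if len(first.split()) not in (1, 2):
        return artist
    return f"{first} {last}"

def fix_artist_field(field: str) -> str:
    """Apply the swap to each artist in an '&'-separated field."""
    return ' & '.join(swap_comma_name(a.strip()) for a in field.split('&'))

print(fix_artist_field("Presley, Elvis & Parton, Dolly"))  # Elvis Presley & Dolly Parton
```

Note that splitting on "&" first protects band names like "Earth, Wind & Fire": each "&"-separated piece is checked independently, and "Earth, Wind" fails the one-or-two-words test only when more than two words follow the comma, so the total-word cap is the real safety net there.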
@ -1,151 +0,0 @@
#!/usr/bin/env python3
"""
Script to reset karaoke tracking and re-download files with the new channel parser.

This script will:
1. Reset the karaoke_tracking.json to remove all downloaded entries
2. Optionally delete the downloaded files
3. Allow you to re-download with the new channel parser system
"""

import json
import os
import shutil
from pathlib import Path
from typing import List, Dict, Any

from karaoke_downloader.data_path_manager import get_data_path_manager


def reset_karaoke_tracking(tracking_file: str = None) -> None:
    """Reset the karaoke tracking file to empty state."""
    if tracking_file is None:
        tracking_file = str(get_data_path_manager().get_karaoke_tracking_path())
    print(f"Resetting {tracking_file}...")

    # Create backup of current tracking
    backup_file = f"{tracking_file}.backup"
    if os.path.exists(tracking_file):
        shutil.copy2(tracking_file, backup_file)
        print(f"Created backup: {backup_file}")

    # Reset to empty state
    empty_tracking = {
        "playlists": {},
        "songs": {}
    }

    with open(tracking_file, 'w', encoding='utf-8') as f:
        json.dump(empty_tracking, f, indent=2, ensure_ascii=False)

    print(f"✅ Reset {tracking_file} to empty state")


def delete_downloaded_files(downloads_dir: str = "downloads") -> None:
    """Delete all downloaded files and folders."""
    if not os.path.exists(downloads_dir):
        print(f"Downloads directory {downloads_dir} does not exist.")
        return

    print(f"Deleting all files in {downloads_dir}...")

    try:
        shutil.rmtree(downloads_dir)
        print(f"✅ Deleted {downloads_dir} directory")
    except Exception as e:
        print(f"❌ Error deleting {downloads_dir}: {e}")


def show_download_stats(tracking_file: str = None) -> None:
    """Show statistics about current downloads."""
    if tracking_file is None:
        tracking_file = str(get_data_path_manager().get_karaoke_tracking_path())
    if not os.path.exists(tracking_file):
        print("No tracking file found.")
        return

    with open(tracking_file, 'r', encoding='utf-8') as f:
        tracking = json.load(f)

    songs = tracking.get("songs", {})
    total_songs = len(songs)

    if total_songs == 0:
        print("No songs in tracking file.")
        return

    # Count by status
    status_counts = {}
    channel_counts = {}

    for song_id, song_data in songs.items():
        status = song_data.get("status", "UNKNOWN")
        channel = song_data.get("channel_name", "UNKNOWN")

        status_counts[status] = status_counts.get(status, 0) + 1
        channel_counts[channel] = channel_counts.get(channel, 0) + 1

    print(f"\n📊 Current Download Statistics:")
    print(f"Total songs: {total_songs}")
    print(f"\nBy Status:")
    for status, count in status_counts.items():
        print(f"  {status}: {count}")

    print(f"\nBy Channel:")
    for channel, count in channel_counts.items():
        print(f"  {channel}: {count}")


def main():
    """Main function to handle reset and re-download process."""
    print("🔄 Karaoke Download Reset and Re-download Tool")
    print("=" * 50)

    # Show current stats
    print("\nCurrent download statistics:")
    show_download_stats()

    # Ask user what they want to do
    print("\nOptions:")
    print("1. Reset tracking only (keep files)")
    print("2. Reset tracking and delete all downloaded files")
    print("3. Show current stats only")
    print("4. Exit")

    choice = input("\nEnter your choice (1-4): ").strip()

    if choice == "1":
        print("\n🔄 Resetting tracking only...")
        reset_karaoke_tracking()
        print("\n✅ Tracking reset complete!")
        print("You can now re-download files with the new channel parser system.")
        print("\nTo re-download, run:")
        print("python download_karaoke.py --file data/channels.txt --limit 50")

    elif choice == "2":
        print("\n🔄 Resetting tracking and deleting files...")
        confirm = input("Are you sure you want to delete ALL downloaded files? (yes/no): ").strip().lower()

        if confirm == "yes":
            reset_karaoke_tracking()
            delete_downloaded_files()
            print("\n✅ Reset complete! All tracking and files have been removed.")
            print("You can now re-download files with the new channel parser system.")
            print("\nTo re-download, run:")
            print("python download_karaoke.py --file data/channels.txt --limit 50")
        else:
            print("Operation cancelled.")

    elif choice == "3":
        print("\n📊 Current statistics:")
        show_download_stats()

    elif choice == "4":
        print("Exiting...")

    else:
        print("Invalid choice. Please enter 1, 2, 3, or 4.")


if __name__ == "__main__":
    main()
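The reset script's `show_download_stats` builds its per-status and per-channel tallies with repeated `dict.get` calls; the stdlib `collections.Counter` expresses the same aggregation more compactly. A sketch over the same `"songs"` shape from `karaoke_tracking.json` (the sample entries are illustrative, not real tracking data):

```python
from collections import Counter

# Shape mirrors the "songs" map in karaoke_tracking.json; values are made up
songs = {
    "vid1": {"status": "DOWNLOADED", "channel_name": "Let's Sing Karaoke"},
    "vid2": {"status": "FAILED", "channel_name": "Let's Sing Karaoke"},
    "vid3": {"status": "DOWNLOADED", "channel_name": "Sing King"},
}

status_counts = Counter(s.get("status", "UNKNOWN") for s in songs.values())
channel_counts = Counter(s.get("channel_name", "UNKNOWN") for s in songs.values())

# most_common() also gives the sorted-by-frequency ordering for free
for status, count in status_counts.most_common():
    print(f"  {status}: {count}")
```

`Counter` behaves like the manual dict here but additionally supports `most_common()`, arithmetic between counters, and a zero default for missing keys.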