Compare commits

...

31 Commits

Author SHA1 Message Date
1b6ac6454b Signed-off-by: Matt Bruce <mbrucedogs@gmail.com> 2025-08-11 09:01:31 -05:00
e34c43a8f4 Signed-off-by: Matt Bruce <mbrucedogs@gmail.com> 2025-08-11 09:00:46 -05:00
6a796d8571 Signed-off-by: Matt Bruce <mbrucedogs@gmail.com> 2025-08-10 10:28:29 -05:00
b0eb76930a Merge branch 'develop' of ssh://git@192.168.1.128:220/mbrucedogs/KaraokeVideoDownloader.git into develop 2025-08-05 16:31:03 -05:00
157f3a171b Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-08-05 16:30:20 -05:00
eb3642d652 mac support Signed-off-by: Matt Bruce <mbrucedogs@gmail.com> 2025-08-05 16:11:29 -05:00
a82c9741a5 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-08-05 15:38:39 -05:00
50b402ddec Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-29 09:07:31 -05:00
9f0787d00a Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-29 08:56:25 -05:00
409e66780c Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-29 08:49:43 -05:00
42e7a6a09c Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-29 08:45:12 -05:00
ec95b24a69 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 15:44:46 -05:00
21f8348419 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 14:23:19 -05:00
d18ac54476 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 14:09:07 -05:00
c48c1d3696 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 13:47:36 -05:00
273a748a1a Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 12:02:50 -05:00
5f3b00a39a Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 12:02:42 -05:00
24a6a37efd Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 09:30:09 -05:00
c864af7794 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 08:09:47 -05:00
613b64601a Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 07:51:40 -05:00
981f92ce95 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-28 05:42:07 -05:00
8dbc2fb8fd Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 22:52:17 -05:00
81b3d2d88c Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 22:49:35 -05:00
95a49bf39e Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 22:02:18 -05:00
c8f02ac3b4 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 21:58:52 -05:00
f914d54067 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 20:33:26 -05:00
ea07188739 Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 19:47:05 -05:00
2c63bf809b Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 12:38:35 -05:00
7090fad1fd Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 12:01:39 -05:00
c78be7a7ad Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 11:40:57 -05:00
e6b2c9443c Signed-off-by: mbrucedogs <mbrucedogs@gmail.com> 2025-07-27 10:56:19 -05:00
59 changed files with 444314 additions and 179720 deletions

.gitignore vendored

@@ -14,9 +14,6 @@ logs/
*.log
# Tracking and cache files
karaoke_tracking.json
karaoke_tracking.json.backup
songlist_tracking.json
*.cache
# yt-dlp temporary files

PRD.md

@@ -1,8 +1,8 @@
# 🎤 Karaoke Video Downloader PRD (v3.3)
# 🎤 Karaoke Video Downloader PRD (v3.4.4)
## ✅ Overview
A Python-based Windows CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp.exe`, with advanced tracking, songlist prioritization, and flexible configuration. The codebase has been comprehensively refactored into a modular architecture with centralized utilities for improved maintainability, error handling, and code reuse.
A Python-based cross-platform CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp`, with advanced tracking, songlist prioritization, and flexible configuration. Supports Windows and macOS with automatic platform detection. The codebase has been comprehensively refactored into a modular architecture with centralized utilities for improved maintainability, error handling, and code reuse.
---
@@ -63,9 +63,9 @@ The codebase has been refactored into focused modules with centralized utilities
---
## ⚙️ Platform & Stack
- **Platform:** Windows
- **Platform:** Windows, macOS
- **Interface:** Command-line (CLI)
- **Tech Stack:** Python 3.7+, yt-dlp.exe, mutagen (for ID3 tagging)
- **Tech Stack:** Python 3.7+, yt-dlp (platform-specific binary), mutagen (for ID3 tagging)
---
@@ -101,6 +101,7 @@ python download_karaoke.py --clear-cache SingKingKaraoke
- ✅ Songlist integration: prioritize and track custom songlists
- ✅ Songlist-only mode: download only songs from the songlist
- ✅ Songlist focus mode: download only songs from specific playlists by title
- ✅ Force download mode: bypass all existing file checks and re-download songs regardless of server duplicates or existing files
- ✅ Global songlist tracking to avoid duplicates across channels
- ✅ ID3 tagging for artist/title in MP4 files (mutagen)
- ✅ Real-time progress and detailed logging
@@ -122,6 +123,8 @@ python download_karaoke.py --clear-cache SingKingKaraoke
- ✅ **Centralized file operations**: Single source of truth for filename sanitization, file validation, and path operations
- ✅ **Centralized song validation**: Unified logic for checking if songs should be downloaded across all modules
- ✅ **Enhanced configuration management**: Structured configuration with dataclasses, type safety, and validation
- ✅ **Manual video collection**: Static video collection system for managing individual karaoke videos that don't belong to regular channels. Use `--manual` to download from `data/manual_videos.json`.
- ✅ **Channel-specific parsing rules**: JSON-based configuration for parsing video titles from different YouTube channels, with support for various title formats and cleanup rules.
---
@@ -149,19 +152,34 @@ KaroakeVideoDownloader/
│ ├── check_resolution.py # Resolution checker utility
│ ├── resolution_cli.py # Resolution config CLI
│ └── tracking_cli.py # Tracking management CLI
├── data/ # All config, tracking, cache, and songlist files
│ ├── config.json
├── config/ # Configuration files
│ └── config.json # Main configuration file
├── data/ # All tracking, cache, and songlist files
│ ├── karaoke_tracking.json
│ ├── songlist_tracking.json
│ ├── channel_cache.json
│ ├── channels.txt
│ ├── channels.json # Channel configuration with parsing rules
│ ├── manual_videos.json # Manual video collection
│ └── songList.json
├── utilities/ # Utility scripts and tools
│ ├── add_manual_video.py # Manual video management
│ ├── build_cache_from_raw.py # Cache building utility
│ ├── cleanup_duplicate_files.py # File cleanup utilities
│ ├── cleanup_recent_tracking.py # Tracking cleanup utilities
│ ├── deduplicate_songlist_tracking.py # Data deduplication
│ ├── fix_artist_name_format.py # Data cleanup utilities
│ ├── fix_artist_name_format_simple.py
│ ├── fix_code_quality.py # Development tools
│ ├── reset_and_redownload.py # Maintenance utilities
│ └── songlist_report.py # Reporting utilities
├── downloads/ # All video output
│ └── [ChannelName]/ # Per-channel folders
├── logs/ # Download logs
├── downloader/yt-dlp.exe # yt-dlp binary
├── tests/ # Diagnostic and test scripts
│ └── test_installation.py
├── downloader/yt-dlp.exe # yt-dlp binary (Windows)
├── downloader/yt-dlp_macos # yt-dlp binary (macOS)
├── src/tests/ # Test scripts
│ ├── test_macos.py # macOS setup and functionality tests
│ └── test_platform.py # Platform detection tests
├── download_karaoke.py # Main entry point (thin wrapper)
├── README.md
├── PRD.md
@@ -176,6 +194,8 @@ KaroakeVideoDownloader/
- `--songlist-priority`: Prioritize songlist songs in download queue
- `--songlist-only`: Download only songs from the songlist
- `--songlist-focus <PLAYLIST_TITLE1> <PLAYLIST_TITLE2>...`: Focus on specific playlists by title (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`)
- `--songlist-file <FILE_PATH>`: Custom songlist file path to use with --songlist-focus (default: data/songList.json)
- `--force`: **Force download from channels, bypassing all existing file checks and re-downloading if necessary**
- `--songlist-status`: Show songlist download progress
- `--limit <N>`: Limit number of downloads (enables fast mode with early exit)
- `--resolution <720p|1080p|...>`: Override resolution
@@ -188,7 +208,11 @@ KaroakeVideoDownloader/
- `--fuzzy-match`: **Enable fuzzy matching for songlist-to-video matching (uses rapidfuzz if available)**
- `--fuzzy-threshold <N>`: **Fuzzy match threshold (0-100, default 85)**
- `--parallel`: **Enable parallel downloads for improved speed**
- `--workers <N>`: **Number of parallel download workers (1-10, default: 3)**
- `--workers <N>`: **Number of parallel download workers (1-10, default: 3, only used with --parallel)**
- `--manual`: **Download from manual videos collection (data/manual_videos.json)**
- `--channel-focus <CHANNEL_NAME>`: **Download from a specific channel by name (e.g., 'SingKingKaraoke')**
- `--all-videos`: **Download all videos from channel (not just songlist matches), skipping existing files and songs in songs.json**
- `--dry-run`: **Build download plan and show what would be downloaded without actually downloading anything**
---
@@ -199,6 +223,8 @@ KaroakeVideoDownloader/
- **ID3 Tagging:** Artist/title extracted from video title and embedded in MP4 files.
- **Cleanup:** Extra files from yt-dlp (e.g., `.info.json`) are automatically removed after download.
- **Reset/Clear:** Use `--reset-channel` to reset all tracking and files for a channel (optionally including songlist songs with `--reset-songlist`). Use `--clear-cache` to clear cached video lists for a channel or all channels.
- **Channel-Specific Parsing:** Uses `data/channels.json` to define parsing rules for each YouTube channel, handling different video title formats (e.g., "Artist - Title", "Artist Title", "Title | Artist", etc.).
- **Manual Video Collection:** Static video management system using `data/manual_videos.json` for individual karaoke videos that don't belong to regular channels. Accessible via `--manual` parameter.
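The ID3 tagging behavior described above can be sketched with mutagen's MP4 interface; `build_mp4_tags` and `tag_mp4` are illustrative names for this sketch, not the project's actual helpers:

```python
# iTunes-style metadata atoms mutagen uses for MP4 files
ARTIST_ATOM = "\xa9ART"
TITLE_ATOM = "\xa9nam"

def build_mp4_tags(artist: str, title: str) -> dict:
    """Build the tag mapping; mutagen's MP4 API expects list values."""
    return {ARTIST_ATOM: [artist], TITLE_ATOM: [title]}

def tag_mp4(path: str, artist: str, title: str) -> None:
    """Embed artist/title into an MP4 file in place."""
    from mutagen.mp4 import MP4  # third-party: pip install mutagen
    video = MP4(path)
    for key, value in build_mp4_tags(artist, title).items():
        video[key] = value
    video.save()
```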
## 🔧 Refactoring Improvements (v3.3)
The codebase has been comprehensively refactored to improve maintainability and reduce code duplication. Recent improvements have enhanced reliability, performance, and code organization:
@@ -252,7 +278,7 @@ The codebase has been comprehensively refactored to improve maintainability and
### **New Parallel Download System (v3.4)**
- **Parallel downloader module:** `parallel_downloader.py` provides thread-safe concurrent download management
- **Configurable concurrency:** Use `--parallel --workers N` to enable parallel downloads with N workers (1-10)
- **Configurable concurrency:** Use `--parallel` to enable parallel downloads with 3 workers by default, or `--parallel --workers N` for custom worker count (1-10)
- **Thread-safe operations:** All tracking, caching, and progress operations are thread-safe
- **Real-time progress tracking:** Shows active downloads, completion status, and overall progress
- **Automatic retry mechanism:** Failed downloads are automatically retried with reduced concurrency
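A thread-safe worker pool of the kind described above can be sketched with `concurrent.futures`; this is a minimal stand-in for the project's `ParallelDownloader`, with hypothetical names, not its actual implementation:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def download_parallel(download_plan, download_one, workers=3):
    """Run downloads concurrently; `download_one` handles one plan entry.

    Returns (succeeded, failed) lists. Worker count is clamped to the
    documented 1-10 range.
    """
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max(1, min(workers, 10))) as pool:
        futures = {pool.submit(download_one, item): item for item in download_plan}
        for future in as_completed(futures):
            item = futures[future]
            try:
                future.result()  # re-raises any worker exception
                succeeded.append(item)
            except Exception:
                failed.append(item)  # candidates for the retry pass
    return succeeded, failed
```

Failed entries could then be retried with a smaller pool, matching the "reduced concurrency" retry behavior noted above.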
@@ -268,8 +294,337 @@ The codebase has been comprehensively refactored to improve maintainability and
- [ ] Download scheduling and retry logic
- [ ] More granular status reporting
- [x] **Parallel downloads for improved speed** ✅ **COMPLETED**
- [x] **Enhanced fuzzy matching with improved video title parsing** ✅ **COMPLETED**
- [x] **Consolidated extract_artist_title function** ✅ **COMPLETED**
- [x] **Duplicate file prevention and filename consistency** ✅ **COMPLETED**
- [ ] Unit tests for all modules
- [ ] Integration tests for end-to-end workflows
- [ ] Plugin system for custom file operations
- [ ] Advanced configuration UI
- [ ] Real-time download progress visualization
## 🔧 Recent Bug Fixes & Improvements (v3.4.1)
### **Enhanced Fuzzy Matching (v3.4.1)**
- **Improved `extract_artist_title` function**: Enhanced to handle multiple video title formats beyond simple "Artist - Title" patterns
- **"Title Karaoke | Artist Karaoke Version" format**: Correctly parses titles like "Hold On Loosely Karaoke | 38 Special Karaoke Version"
- **"Title Artist KARAOKE" format**: Handles titles ending with "KARAOKE" and attempts to extract artist information
- **Fallback handling**: Returns empty artist and full title for unparseable formats
- **Consolidated function usage**: Removed duplicate `extract_artist_title` implementations across modules
- **Single source of truth**: All modules now import from `fuzzy_matcher.py`
- **Consistent parsing**: Eliminated inconsistencies between different parsing implementations
- **Better maintainability**: Changes to parsing logic only need to be made in one place
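The title formats listed above could be handled by parsing logic along these lines; a simplified sketch, not the full `fuzzy_matcher.py` implementation:

```python
def extract_artist_title(video_title: str) -> tuple:
    """Parse (artist, title) from a video title, covering the formats above."""
    # "Title Karaoke | Artist Karaoke Version"
    if "|" in video_title:
        left, right = (part.strip() for part in video_title.split("|", 1))
        title = left.removesuffix("Karaoke").strip()
        artist = right.removesuffix("Karaoke Version").strip()
        if artist and title:
            return artist, title
    # "Artist - Title"
    if " - " in video_title:
        artist, title = (part.strip() for part in video_title.split(" - ", 1))
        return artist, title
    # Fallback: unparseable formats keep the full title with an empty artist
    return "", video_title.strip()
```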
### **Fixed Import Conflicts**
- **Resolved import conflict in `download_planner.py`**: Updated to use the enhanced `extract_artist_title` from `fuzzy_matcher.py` instead of the simpler version from `id3_utils.py`
- **Updated `id3_utils.py`**: Now imports `extract_artist_title` from `fuzzy_matcher.py` for consistency
### **Enhanced --limit Parameter**
- **Fixed limit application**: The `--limit` parameter now correctly applies to the scanning phase, not just the download execution
- **Improved performance**: When using `--limit N`, only the first N songs are scanned against channels, significantly reducing processing time for large songlists
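Applying the limit during scanning rather than after amounts to truncating the songlist iterator before matching; a sketch with hypothetical names:

```python
from itertools import islice

def scan_with_limit(songs, matcher, limit=None):
    """Match only the first `limit` songs against channel videos.

    `matcher` returns a plan entry or None; with limit=None the full
    songlist is scanned, preserving the old behavior.
    """
    plan = []
    for song in islice(songs, limit):
        entry = matcher(song)
        if entry is not None:
            plan.append(entry)
    return plan
```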
### **Benefits of Recent Improvements**
- **Better matching accuracy**: Enhanced fuzzy matching can now handle a wider variety of video title formats commonly found on YouTube karaoke channels
- **Reduced false negatives**: Songs that previously couldn't be matched due to title format differences now have a higher chance of being found
- **Consistent behavior**: All parts of the system use the same parsing logic, eliminating edge cases where different modules would parse the same title differently
- **Improved performance**: The `--limit` parameter now works as expected, providing faster processing for targeted downloads
- **Cleaner codebase**: Eliminated duplicate code and import conflicts, making the system more maintainable
## 🔧 Recent Bug Fixes & Improvements (v3.4.2)
### **Duplicate File Prevention & Filename Consistency**
- **Enhanced file existence checking**: `check_file_exists_with_patterns()` now detects files with `(2)`, `(3)`, etc. suffixes that yt-dlp creates
- **Automatic duplicate prevention**: Download pipeline skips downloads when files already exist (including duplicates)
- **Updated yt-dlp configuration**: Set `"nooverwrites": false` to prevent yt-dlp from creating duplicate files with suffixes
- **Cleanup utility**: `utilities/cleanup_duplicate_files.py` provides interactive cleanup of existing duplicate files
- **Filename vs ID3 tag consistency**: Removed "(Karaoke Version)" suffix from ID3 tags to match filenames exactly
- **Unified parsing**: Both filename generation and ID3 tagging use the same artist/title extraction logic
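The suffix-aware existence check described above boils down to matching `name.mp4` against `name (N).mp4` variants; a sketch of that matching, with the directory wrapper assuming the helper name used above:

```python
import re

def matches_with_suffix(existing_names, filename):
    """True if `filename` or a yt-dlp duplicate like 'name (2).mp4' appears
    in `existing_names` (an iterable of file names; extension assumed)."""
    dot = filename.rfind(".")
    stem, ext = filename[:dot], filename[dot:]
    pattern = re.compile(re.escape(stem) + r"( \(\d+\))?" + re.escape(ext))
    return any(pattern.fullmatch(name) for name in existing_names)

def check_file_exists_with_patterns(directory, filename):
    """Directory-level wrapper over the name matcher (sketch)."""
    from pathlib import Path
    return matches_with_suffix((p.name for p in Path(directory).iterdir()), filename)
```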
### **Benefits of Duplicate Prevention**
- **No more duplicate files**: Eliminates `(2)`, `(3)` suffix files that waste disk space
- **Consistent metadata**: Filename and ID3 tag use identical artist/title format
- **Efficient disk usage**: Prevents unnecessary downloads of existing files
- **Clear file identification**: Consistent naming across all file operations
## 🛠️ Maintenance
### **Regular Cleanup**
- Run the cleanup utility periodically to remove any duplicate files
- Monitor downloads for any new duplicate creation (should be rare with fixes)
### **Configuration**
- Keep `"nooverwrites": false` in `config/config.json`
- This prevents yt-dlp from creating duplicate files
### **Monitoring**
- Check logs for "⏭️ Skipping download - file already exists" messages
- These indicate the duplicate prevention is working correctly
## 🔧 Recent Bug Fixes & Improvements (v3.4.3)
### **Manual Video Collection System**
- **New `--manual` parameter**: Simple access to manual video collection via `python download_karaoke.py --manual --limit 5`
- **Static video management**: `data/manual_videos.json` stores individual karaoke videos that don't belong to regular channels
- **Helper script**: `add_manual_video.py` provides easy management of manual video entries
- **Full integration**: Manual videos work with all existing features (songlist matching, fuzzy matching, parallel downloads, etc.)
- **No yt-dlp dependency**: Manual videos bypass YouTube API calls for video listing, using static data instead
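Loading the static collection could look like the sketch below; the `{"videos": [...]}` shape and field names are assumptions about the file's schema, not confirmed by the source:

```python
import json
from pathlib import Path

def parse_manual_videos(raw: str) -> list:
    """Parse the manual collection JSON into a list of video entries."""
    data = json.loads(raw)
    return data.get("videos", [])

def load_manual_videos(path="data/manual_videos.json") -> list:
    """Read and parse the static manual video collection."""
    return parse_manual_videos(Path(path).read_text(encoding="utf-8"))
```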
### **Channel-Specific Parsing Rules**
- **JSON-based configuration**: `data/channels.json` replaces `data/channels.txt` with structured channel configuration
- **Parsing rules per channel**: Each channel can define custom parsing rules for video titles
- **Multiple format support**: Handles various title formats like "Artist - Title", "Artist Title", "Title | Artist", etc.
- **Suffix cleanup**: Automatic removal of common karaoke-related suffixes
- **Multi-artist support**: Parsing for titles with multiple artists separated by specific delimiters
- **Backward compatibility**: Still supports legacy `data/channels.txt` format
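A per-channel rule entry might drive parsing like this; the `strip_suffixes`/`separator` keys are a hypothetical schema for illustration, not the actual `channels.json` format:

```python
def parse_with_rules(video_title: str, rules: dict) -> tuple:
    """Apply channel-specific rules: strip configured suffixes, then split
    artist from title on the channel's separator."""
    title = video_title
    for suffix in rules.get("strip_suffixes", []):
        if title.endswith(suffix):
            title = title[: -len(suffix)].strip()
    sep = rules.get("separator", " - ")
    if sep in title:
        artist, song = (part.strip() for part in title.split(sep, 1))
        return artist, song
    return "", title  # no separator: fall back to empty artist
```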
### **Benefits of New Features**
- **Flexible video management**: Easy addition of individual karaoke videos without creating new channels
- **Accurate parsing**: Channel-specific rules ensure correct artist/title extraction for ID3 tags and filenames
- **Consistent metadata**: Proper parsing prevents filename and ID3 tag inconsistencies
- **Easy maintenance**: Simple JSON structure for managing both channels and manual videos
- **Full feature compatibility**: Manual videos work seamlessly with existing download modes and features
## 📚 Documentation Standards
### **Documentation Location**
- **All changes, refactoring, and improvements should be documented in the PRD.md and README.md files**
- **Do NOT create separate .md files for documenting changes, refactoring, or improvements**
- **Use the existing sections in PRD.md and README.md to track all project evolution**
### **Where to Document Changes**
- **PRD.md**: Technical details, architecture changes, bug fixes, and implementation specifics
- **README.md**: User-facing features, usage instructions, and high-level improvements
- **CHANGELOG.md**: Version-specific release notes and change summaries
### **Documentation Requirements**
- **All new features must be documented in both PRD.md and README.md**
- **All refactoring efforts must be documented in the appropriate sections**
- **All bug fixes must be documented with technical details**
- **Version numbers and dates should be clearly marked**
- **Benefits and improvements should be explicitly stated**
### **Maintenance Responsibility**
- **Keep PRD.md and README.md synchronized with code changes**
- **Update documentation immediately when implementing new features**
- **Remove outdated information and consolidate related changes**
- **Ensure all CLI options and features are documented in both files**
## 🔧 Recent Bug Fixes & Improvements (v3.4.4)
### **All Videos Download Mode**
- **New `--all-videos` parameter**: Download all videos from a channel, not just songlist matches
- **Smart MP3/MP4 detection**: Automatically detects if you have MP3 versions in songs.json and downloads MP4 video versions
- **Existing file skipping**: Skips videos that already exist on the filesystem
- **Progress tracking**: Shows clear progress with "Downloading X/Y videos" format
- **Parallel processing support**: Works with `--parallel --workers N` for faster downloads
- **Channel focus integration**: Works with `--channel-focus` to target specific channels
- **Limit support**: Works with `--limit N` to control download batch size
### **Smart Songlist Integration**
- **MP4 version detection**: Checks if MP4 version already exists in songs.json before downloading
- **MP3 upgrade path**: Downloads MP4 video versions when only MP3 versions exist in songlist
- **Duplicate prevention**: Skips downloads when MP4 versions already exist
- **Efficient filtering**: Only processes videos that need to be downloaded
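The MP4-version check above can be sketched as a filter over songs.json entries; the `format` field and entry shape are assumptions for illustration:

```python
def needs_mp4_download(artist: str, title: str, server_songs: list) -> bool:
    """True when no MP4 entry for this song exists in songs.json, i.e. the
    song is new or only an MP3 version is tracked."""
    key = (artist.lower(), title.lower())
    formats = {
        song.get("format", "").lower()
        for song in server_songs
        if (song.get("artist", "").lower(), song.get("title", "").lower()) == key
    }
    return "mp4" not in formats
```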
### **Benefits of All Videos Mode**
- **Complete channel downloads**: Download entire channels without songlist restrictions
- **Automatic format upgrading**: Upgrade MP3 collections to MP4 video versions
- **Efficient processing**: Only downloads videos that don't already exist
- **Flexible control**: Use with limits, parallel processing, and channel targeting
- **Clear progress feedback**: Real-time progress tracking for large downloads
## 🔧 Recent Bug Fixes & Improvements (v3.4.5)
### **Unified Download Workflow Architecture**
- **Unified execution pipeline**: All download modes now use the same execution workflow, eliminating inconsistencies and broken pipelines
- **Consistent behavior**: All modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel) use identical download execution, progress tracking, and error handling
- **Centralized download logic**: Single `execute_unified_download_workflow()` method handles all download execution
- **Automatic parallel support**: All download modes automatically support `--parallel --workers N` without additional implementation
- **Unified cache management**: Consistent progress tracking and resume functionality across all modes
### **Architecture Pattern for New Download Modes**
When adding new download modes in the future, follow this pattern to ensure consistency:
#### **1. Download Plan Building (Mode-Specific)**
Each download mode should build a download plan (list of videos to download) with this structure:
```python
download_plan = [
    {
        "video_id": "video_id",
        "artist": "artist_name",
        "title": "song_title",
        "filename": "sanitized_filename.mp4",
        "channel_name": "channel_name",
        "video_title": "original_video_title",
        "force_download": False
    }
]
```
#### **2. Unified Execution (Shared)**
All modes should use the unified execution workflow:
```python
downloaded_count, success = self.execute_unified_download_workflow(
    download_plan=download_plan,
    cache_file=cache_file,   # Optional, for progress tracking
    limit=limit,             # Optional, for limiting downloads
    show_progress=True,      # Optional, for progress display
)
```
#### **3. Execution Method Selection (Automatic)**
The unified workflow automatically chooses execution method based on settings:
- **Sequential**: Uses `DownloadPipeline` for single-threaded downloads
- **Parallel**: Uses `ParallelDownloader` when `--parallel` is enabled
#### **4. Required Implementation Pattern**
```python
def download_new_mode(self, ...):
    """New download mode implementation."""
    # 1. Build download plan (mode-specific logic)
    download_plan = []
    for video in videos_to_download:
        download_plan.append({
            "video_id": video["id"],
            "artist": artist,
            "title": title,
            "filename": filename,
            "channel_name": channel_name,
            "video_title": video["title"],
            "force_download": force_download
        })

    # 2. Create cache file (optional, for progress tracking)
    cache_file = get_download_plan_cache_file("new_mode", **plan_kwargs)
    save_plan_cache(cache_file, download_plan, [])

    # 3. Use unified execution workflow
    downloaded_count, success = self.execute_unified_download_workflow(
        download_plan=download_plan,
        cache_file=cache_file,
        limit=limit,
        show_progress=True,
    )
    return success
```
### **Benefits of Unified Architecture**
- **Consistency**: All modes behave identically for execution, progress tracking, and error handling
- **Maintainability**: Changes to download execution only need to be made in one place
- **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
- **Extensibility**: New modes automatically get all existing features (parallel downloads, progress tracking, etc.)
- **Testing**: Easier to test since all modes use the same execution logic
### **What Was Fixed**
- **Broken Pipeline**: Previously, different modes used different execution paths, leading to inconsistencies
- **Missing Method**: Added missing `download_latest_per_channel()` method that was referenced in CLI but not implemented
- **Code Duplication**: Eliminated duplicate download execution logic across different modes
- **Inconsistent Behavior**: All modes now have identical progress tracking, error handling, and cache management
### **Future Development Guidelines**
1. **NEVER implement custom download execution logic** in new download modes
2. **ALWAYS use `execute_unified_download_workflow()`** for download execution
3. **Focus on download plan building** - that's where mode-specific logic belongs
4. **Use the standard download plan structure** for consistency
5. **Implement cache file handling** for progress tracking and resume functionality
6. **Test with both sequential and parallel modes** to ensure compatibility
---
## 🚀 Future Enhancements
- [ ] Web UI for easier management
- [ ] More advanced song matching (multi-language)
- [ ] Download scheduling and retry logic
- [ ] More granular status reporting
- [x] **Parallel downloads for improved speed** ✅ **COMPLETED**
- [x] **Enhanced fuzzy matching with improved video title parsing** ✅ **COMPLETED**
- [x] **Consolidated extract_artist_title function** ✅ **COMPLETED**
- [x] **Duplicate file prevention and filename consistency** ✅ **COMPLETED**
- [ ] Unit tests for all modules
- [ ] Integration tests for end-to-end workflows
- [ ] Plugin system for custom file operations
- [ ] Advanced configuration UI
- [ ] Real-time download progress visualization
## 🔧 Recent Bug Fixes & Improvements (v3.4.4)
### **macOS Support with Automatic Platform Detection**
- **Cross-platform compatibility**: Added support for macOS alongside Windows
- **Automatic platform detection**: Detects operating system and selects appropriate yt-dlp binary
- **Flexible yt-dlp integration**: Supports both binary files (`yt-dlp_macos`) and pip installation (`python3 -m yt_dlp`)
- **Setup automation**: `setup_macos.py` script for easy macOS setup with FFmpeg and yt-dlp installation
- **Command parsing**: Intelligent parsing of yt-dlp commands (file paths vs. module commands)
- **Enhanced validation**: Platform-specific error messages and validation in CLI
- **Backward compatibility**: Maintains full compatibility with existing Windows installations
### **Benefits of macOS Support**
- **Native macOS experience**: No need for Windows compatibility layers or virtualization
- **Automatic setup**: Simple setup script handles all dependencies
- **Flexible installation**: Choose between binary download or pip installation
- **Consistent functionality**: All features work identically on both platforms
- **Easy maintenance**: Platform detection handles configuration automatically
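The detection described above could be sketched as follows, using the binary paths from the project layout; the function name and fallback order are illustrative assumptions:

```python
import platform
import sys
from pathlib import Path

def resolve_ytdlp_command() -> list:
    """Pick how to invoke yt-dlp: the bundled per-platform binary when
    present, otherwise the pip-installed module (python -m yt_dlp)."""
    if platform.system() == "Windows":
        binary = Path("downloader/yt-dlp.exe")
    else:
        binary = Path("downloader/yt-dlp_macos")
    if binary.exists():
        return [str(binary)]
    # Fall back to a pip installation of yt-dlp
    return [sys.executable, "-m", "yt_dlp"]
```

The returned list can be passed directly to `subprocess.run` with the download arguments appended.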
### **Setup Instructions**
```bash
# Automatic setup (recommended)
python3 setup_macos.py
# Test installation
python3 src/tests/test_macos.py
# Manual setup options
# 1. Install yt-dlp via pip: pip3 install yt-dlp
# 2. Download binary: curl -L -o downloader/yt-dlp_macos https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos
# 3. Install FFmpeg: brew install ffmpeg
```
## 🔧 Recent Bug Fixes & Improvements (v3.4.7)
### **Configurable Data Directory Path**
- **Centralized Data Path Management**: New `data_path_manager.py` module provides unified data directory path management
- **Configurable Location**: Data directory path can be set in `config/config.json` under `folder_structure.data_dir`
- **Backward Compatibility**: Defaults to "data" directory if not configured
- **Cross-Project Integration**: Enables the karaoke downloader to be used as a component in other projects with different data directory structures
- **Updated All Modules**: All modules now use the data path manager instead of hardcoded "data/" paths
- **Utility Functions**: Provides `get_data_path()`, `get_data_dir()`, and `get_data_path_manager()` functions for easy access
- **Fixed Circular Dependency**: Moved `config.json` from `data/` to the `config/` directory to resolve the chicken-and-egg problem
### **Benefits of Configurable Data Directory**
- **Flexible Deployment**: Can be integrated into other projects with different directory structures
- **Centralized Configuration**: Single point of configuration for all data file paths
- **Maintainable Code**: Eliminates hardcoded paths throughout the codebase
- **Easy Testing**: Can use temporary directories for testing without affecting production data
- **Future-Proof**: Makes it easier to change data directory structure in the future
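The resolution logic can be sketched as below; signatures are simplified to take the parsed config dict rather than reading `config/config.json` themselves, so the names only approximate the real `data_path_manager` API:

```python
import json
from pathlib import Path

def data_dir_from_config(config: dict) -> Path:
    """Resolve the data directory from folder_structure.data_dir,
    defaulting to 'data' when not configured."""
    return Path(config.get("folder_structure", {}).get("data_dir", "data"))

def get_data_path(filename: str, config: dict) -> Path:
    """Build the full path to a data file under the configured directory."""
    return data_dir_from_config(config) / filename
```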
### **Circular Dependency Solution**
The original implementation had a circular dependency problem:
- **Problem**: `config.json` was located in the `data/` directory
- **Issue**: To read the config file, we needed to know where the data directory is
- **Conflict**: But the data directory location is specified in the config file
- **Solution**: Moved `config.json` to the `config/` directory as a fixed location
- **Result**: Config file is always accessible in a dedicated config directory, and data directory can be configured within it
- **Backward Compatibility**: System still works with config files in custom data directories when explicitly specified
## 🔧 Recent Bug Fixes & Improvements (v3.4.6)
### **Dry Run Mode**
- **New `--dry-run` parameter**: Build download plan and show what would be downloaded without actually downloading anything
- **Plan preview**: Shows total videos in plan and preview of first 5 videos
- **Safe testing**: Test download configurations without consuming bandwidth or disk space
- **All mode support**: Works with all download modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel)
- **Progress simulation**: Shows what the download process would look like without executing it
### **Benefits of Dry Run Mode**
- **Safe testing**: Test complex download configurations without downloading anything
- **Plan validation**: Verify that the download plan contains the expected videos
- **Configuration debugging**: Troubleshoot download settings before committing to downloads
- **Resource conservation**: Save bandwidth and disk space during testing
- **User education**: Help users understand what the tool will do before running it
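The plan preview described above amounts to formatting the first few entries; a sketch whose wording is illustrative, not the tool's exact console output:

```python
def preview_download_plan(download_plan: list, preview: int = 5) -> list:
    """Render the --dry-run summary: total count plus the first `preview`
    entries. Returns lines so the caller decides how to print them."""
    lines = [f"Dry run: {len(download_plan)} video(s) in plan"]
    for entry in download_plan[:preview]:
        lines.append(f"  would download: {entry['artist']} - {entry['title']}")
    if len(download_plan) > preview:
        lines.append(f"  ... and {len(download_plan) - preview} more")
    return lines
```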
### **Example Usage**
```bash
# Test songlist download plan
python download_karaoke.py --songlist-only --limit 5 --dry-run
# Test channel download plan
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 10 --dry-run
# Test with fuzzy matching
python download_karaoke.py --songlist-only --fuzzy-match --limit 3 --dry-run
```
### **Future Development Guidelines**

README.md

@@ -1,6 +1,6 @@
# 🎤 Karaoke Video Downloader
A Python-based Windows CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp.exe`, with advanced tracking, songlist prioritization, and flexible configuration.
A Python-based cross-platform CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp`, with advanced tracking, songlist prioritization, and flexible configuration. Supports Windows and macOS with automatic platform detection.
## ✨ Features
- 🎵 **Channel & Playlist Downloads**: Download all videos from a YouTube channel or playlist
@@ -13,7 +13,7 @@ A Python-based Windows CLI tool to download karaoke videos from YouTube channels
- 📈 **Real-Time Progress**: Detailed console and log output
- 🧹 **Reset/Clear Channel**: Reset all tracking and files for a channel, or clear channel cache via CLI
- 🗂️ **Latest-per-channel download**: Download the latest N videos from each channel in a single batch, with server deduplication, fuzzy matching support, per-channel download plan, robust resume, and unique plan cache. Use --latest-per-channel and --limit N.
- 🧩 **Fuzzy Matching**: Optionally use fuzzy string matching for songlist-to-video matching (with --fuzzy-match, requires rapidfuzz for best results)
- 🧩 **Enhanced Fuzzy Matching**: Advanced fuzzy string matching for songlist-to-video matching with improved video title parsing (handles multiple title formats like "Title Karaoke | Artist Karaoke Version")
- ⚡ **Fast Mode with Early Exit**: When a limit is set, scans channels and songs in order, downloads immediately when a match is found, and stops as soon as the limit of successful downloads is reached
- 🔄 **Deduplication Across Channels**: Ensures the same song is not downloaded from multiple channels, even if it appears in more than one channel's video list
- 📋 **Default Channel File**: Automatically uses data/channels.txt as the default channel list for songlist modes (no need to specify --file every time)
@@ -21,10 +21,20 @@ A Python-based Windows CLI tool to download karaoke videos from YouTube channels
- ⚡ **Optimized Scanning**: High-performance channel scanning with O(n×m) complexity, pre-processed lookups, and early termination for faster matching
- 🏷️ **Server Duplicates Tracking**: Automatically checks against local songs.json file and marks duplicates for future skipping, preventing re-downloads of songs already on the server
- ⚡ **Parallel Downloads**: Enable concurrent downloads with `--parallel --workers N` for significantly faster batch downloads (3-5x speedup)
- 📊 **Unmatched Songs Reports**: Generate detailed reports of songs that couldn't be found in any channel with `--generate-unmatched-report`
- 🛡️ **Duplicate File Prevention**: Automatically detects and prevents duplicate files with `(2)`, `(3)` suffixes, with cleanup utility for existing duplicates
- 🏷️ **Consistent Metadata**: Filename and ID3 tag use identical artist/title format for clear file identification
- 🍎 **macOS Support**: Automatic platform detection and setup with native macOS binaries and FFmpeg integration
## 🏗️ Architecture
The codebase has been comprehensively refactored into a modular architecture with centralized utilities for improved maintainability, error handling, and code reuse:
### **Configurable Data Directory (v3.4.7)**
- **Centralized Data Path Management**: `data_path_manager.py` provides unified data directory path management
- **Configurable Location**: Data directory path can be set in `config/config.json` under `folder_structure.data_dir`
- **Backward Compatibility**: Defaults to "data" directory if not configured
- **Cross-Project Integration**: Enables the karaoke downloader to be used as a component in other projects with different data directory structures
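The data-directory resolution described above can be sketched as follows. This is a minimal illustration, assuming the config schema from the bullets (`folder_structure.data_dir` in `config/config.json`); the function name is hypothetical, not the actual `data_path_manager.py` API:

```python
import json
from pathlib import Path

def resolve_data_dir(config_path="config/config.json"):
    """Resolve the data directory from config, defaulting to 'data'."""
    try:
        with open(config_path, "r", encoding="utf-8") as f:
            config = json.load(f)
        return Path(config.get("folder_structure", {}).get("data_dir", "data"))
    except (FileNotFoundError, json.JSONDecodeError):
        # Backward compatibility: fall back to the historical default
        return Path("data")
```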
### Core Modules:
- **`downloader.py`**: Main orchestrator and CLI interface
- **`video_downloader.py`**: Core video download execution and orchestration
@@ -46,47 +56,192 @@ The codebase has been comprehensively refactored into a modular architecture wit
- **`tracking_cli.py`**: Tracking management CLI
### New Utility Modules (v3.3):
- **`parallel_downloader.py`**: Parallel download management with thread-safe operations
- `ParallelDownloader` class: Manages concurrent downloads with configurable workers
- `DownloadTask` and `DownloadResult` dataclasses: Structured task and result management
- Thread-safe progress tracking and error handling
- Automatic retry mechanism for failed downloads
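A minimal sketch of how such a parallel downloader can be structured with `concurrent.futures`. The class and dataclass names mirror the bullets above, but the fields and internals here are assumptions, not the real `parallel_downloader.py` implementation (which also handles retries and richer progress reporting):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass
from threading import Lock

@dataclass
class DownloadTask:
    video_id: str
    filename: str

@dataclass
class DownloadResult:
    task: DownloadTask
    success: bool
    error: str = ""

class ParallelDownloader:
    def __init__(self, download_fn, workers=3):
        self.download_fn = download_fn
        self.workers = max(1, min(workers, 10))  # clamp to the documented 1-10 range
        self._lock = Lock()
        self.completed = 0

    def run(self, tasks):
        results = []
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            futures = {pool.submit(self._safe_download, t): t for t in tasks}
            for fut in as_completed(futures):
                results.append(fut.result())
        return results

    def _safe_download(self, task):
        try:
            self.download_fn(task)
            with self._lock:  # thread-safe progress tracking
                self.completed += 1
            return DownloadResult(task, True)
        except Exception as exc:
            return DownloadResult(task, False, str(exc))
```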
- **`file_utils.py`**: Centralized file operations, filename sanitization, and file validation
- `sanitize_filename()`: Create safe filenames from artist/title
- `generate_possible_filenames()`: Generate filename patterns for different modes
- `check_file_exists_with_patterns()`: Check for existing files using multiple patterns
- `is_valid_mp4_file()`: Validate MP4 files with header checking
- `cleanup_temp_files()`: Remove temporary yt-dlp files
- `ensure_directory_exists()`: Safe directory creation
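Two of the helpers listed above can be sketched like this; the bodies are illustrative assumptions (the real `file_utils.py` covers more characters and patterns):

```python
import re
from pathlib import Path

def sanitize_filename(artist, title):
    """Build a filesystem-safe 'Artist - Title' name (sketch)."""
    raw = f"{artist} - {title}"
    return re.sub(r'[<>:"/\\|?*]', "", raw).strip()

def check_file_exists_with_patterns(folder, base_name):
    """True if base_name.mp4 or a '(2)'/'(3)'-suffixed duplicate already exists."""
    folder = Path(folder)
    if (folder / f"{base_name}.mp4").exists():
        return True
    return any(folder.glob(f"{base_name} ([0-9]).mp4"))
```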
- **`song_validator.py`**: Centralized song validation logic for checking if songs should be downloaded
- **`song_validator.py`**: Centralized song validation logic
- `SongValidator` class: Unified logic for checking if songs should be downloaded
- `should_skip_song()`: Comprehensive validation with multiple criteria
- `mark_song_failed()`: Consistent failure tracking
- `handle_download_failure()`: Standardized error handling
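The skip logic above amounts to checking a song against several tracking sets, with `--force` overriding everything. A sketch under assumed data structures (the real `SongValidator` works against the tracking files, not plain sets):

```python
def should_skip_song(song, downloaded_ids, server_duplicates, failed_ids, force=False):
    """Unified skip decision (sketch; inputs are hypothetical structures)."""
    if force:
        return False  # --force bypasses every existing-file and duplicate check
    vid = song["video_id"]
    if vid in downloaded_ids:
        return True
    if vid in server_duplicates:  # already on the server per songs.json
        return True
    return vid in failed_ids
```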
### New Utility Modules (v3.4.7):
- **`data_path_manager.py`**: Centralized data directory path management and file path resolution
- **Enhanced `config_manager.py`**: Robust configuration management with dataclasses
- `ConfigManager` class: Type-safe configuration loading and caching
- `DownloadSettings`, `FolderStructure`, `LoggingConfig` dataclasses
- Configuration validation and merging with defaults
- Dynamic resolution updates
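Dataclass-based loading with defaults merging can be sketched as below; the field names are assumptions for illustration, not the actual `DownloadSettings` schema:

```python
import json
from dataclasses import dataclass, fields

@dataclass
class DownloadSettings:
    resolution: str = "1080p"
    format: str = "mp4"

def load_settings(path):
    """Merge on-disk JSON over dataclass defaults, ignoring unknown keys."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except FileNotFoundError:
        data = {}
    known = {f.name for f in fields(DownloadSettings)}
    return DownloadSettings(**{k: v for k, v in data.items() if k in known})
```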
### **Unified Download Workflow (v3.4.5)**
- **`execute_unified_download_workflow()`**: Centralized download execution that all modes use
- **`_execute_sequential_downloads()`**: Sequential download execution using DownloadPipeline
- **`_execute_parallel_downloads()`**: Parallel download execution using ParallelDownloader
### Benefits:
### **Benefits of Enhanced Modular Architecture:**
- **Single Responsibility**: Each module has a focused purpose
- **Centralized Utilities**: Common operations (file operations, song validation, yt-dlp commands, error handling) are centralized
- **Reduced Duplication**: Eliminated ~150 lines of code duplication across modules
- **Testability**: Individual components can be tested separately
- **Maintainability**: Easier to find and fix issues
- **Reusability**: Components can be used independently
- **Robustness**: Better error handling and interruption recovery
- **Consistency**: Standardized error messages and processing pipelines
- **Maintainability**: Changes isolated to specific modules
- **Testability**: Modular components can be tested independently
- **Type Safety**: Comprehensive type hints across all new modules
- **Unified Execution**: All download modes use the same execution pipeline for consistency
## 🔧 Development Guidelines
### **Adding New Download Modes**
When adding new download modes, follow the unified workflow pattern to ensure consistency:
#### **1. Build Download Plan (Mode-Specific)**
```python
def download_new_mode(self, ...):
    # Build download plan with standard structure
    download_plan = []
    for video in videos_to_download:
        download_plan.append({
            "video_id": video["id"],
            "artist": artist,
            "title": title,
            "filename": filename,
            "channel_name": channel_name,
            "video_title": video["title"],
            "force_download": force_download
        })
    # Use unified execution workflow
    downloaded_count, success = self.execute_unified_download_workflow(
        download_plan=download_plan,
        cache_file=cache_file,
        limit=limit,
        show_progress=True,
    )
    return success
```
#### **2. Key Principles**
- **NEVER implement custom download execution logic** - always use `execute_unified_download_workflow()`
- **Focus on download plan building** - that's where mode-specific logic belongs
- **Use the standard download plan structure** for consistency
- **Implement cache file handling** for progress tracking and resume functionality
- **Test with both sequential and parallel modes** to ensure compatibility
#### **3. Benefits of Unified Architecture**
- **Consistency**: All modes behave identically for execution, progress tracking, and error handling
- **Automatic Features**: New modes automatically get parallel downloads, progress tracking, and cache management
- **Maintainability**: Changes to download execution only need to be made in one place
- **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
## 🔧 Recent Improvements (v3.4.1)
### **Enhanced Fuzzy Matching**
- **Improved title parsing**: Enhanced `extract_artist_title` function to handle multiple video title formats
- **Better matching accuracy**: Can now parse titles like "Hold On Loosely Karaoke | 38 Special Karaoke Version"
- **Consistent parsing**: All modules now use the same parsing logic from `fuzzy_matcher.py`
- **Reduced false negatives**: Songs that previously couldn't be matched due to title format differences now have a higher chance of being found
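The parsing improvements above can be illustrated with a simplified version. The regexes and the difflib fallback here are a sketch of the idea only; the actual logic lives in `fuzzy_matcher.py` and prefers rapidfuzz when installed:

```python
import re
from difflib import SequenceMatcher

def extract_artist_title(video_title):
    """Parse common karaoke title formats (sketch of the fuzzy_matcher.py idea)."""
    # Format: "Title Karaoke | Artist Karaoke Version"
    m = re.match(r"^(?P<title>.+?)\s+Karaoke\s*\|\s*(?P<artist>.+?)\s+Karaoke\s+Version\s*$",
                 video_title, re.IGNORECASE)
    if m:
        return m.group("artist").strip(), m.group("title").strip()
    # Format: "Artist - Title (Karaoke Version)"
    m = re.match(r"^(?P<artist>.+?)\s*-\s*(?P<title>.+?)\s*\(Karaoke.*\)\s*$", video_title)
    if m:
        return m.group("artist").strip(), m.group("title").strip()
    return None, None

def fuzzy_score(a, b):
    """difflib fallback when rapidfuzz is unavailable, scaled to 0-100."""
    return int(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)
```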
### **Fixed Import Conflicts**
- **Resolved import conflicts**: Updated modules to use the enhanced `extract_artist_title` from `fuzzy_matcher.py`
- **Consistent behavior**: All parts of the system use the same parsing logic
- **Cleaner codebase**: Eliminated duplicate code and import conflicts
### **Fixed --limit Parameter**
- **Correct limit application**: The `--limit` parameter now properly limits the scanning phase, not just downloads
- **Improved performance**: When using `--limit N`, only the first N songs are scanned, significantly reducing processing time
- **Accurate logging**: Logging messages now show the correct counts for songs that will actually be processed when using `--limit`
### **Code Quality Improvements**
- **Eliminated duplicate functions**: Removed duplicate `extract_artist_title` implementations
- **Fixed import conflicts**: Resolved inconsistencies between different parsing implementations
- **Single source of truth**: All title parsing logic is now centralized in `fuzzy_matcher.py`
## 🔧 Recent Improvements (v3.4.5)
### **Unified Download Workflow Architecture**
- **Unified execution pipeline**: All download modes now use the same execution workflow, eliminating inconsistencies and broken pipelines
- **Consistent behavior**: All modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel) use identical download execution, progress tracking, and error handling
- **Centralized download logic**: Single `execute_unified_download_workflow()` method handles all download execution
- **Automatic parallel support**: All download modes automatically support `--parallel --workers N` without additional implementation
- **Unified cache management**: Consistent progress tracking and resume functionality across all modes
### **What Was Fixed**
- **Broken Pipeline**: Previously, different modes used different execution paths, leading to inconsistencies
- **Missing Method**: Added missing `download_latest_per_channel()` method that was referenced in CLI but not implemented
- **Code Duplication**: Eliminated duplicate download execution logic across different modes
- **Inconsistent Behavior**: All modes now have identical progress tracking, error handling, and cache management
### **Benefits**
- ✅ **Consistency**: All modes behave identically for execution, progress tracking, and error handling
- ✅ **Maintainability**: Changes to download execution only need to be made in one place
- ✅ **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
- ✅ **Extensibility**: New modes automatically get all existing features (parallel downloads, progress tracking, etc.)
- ✅ **Testing**: Easier to test since all modes use the same execution logic
## 🛡️ Duplicate File Prevention & Filename Consistency (v3.4.2)
### **Duplicate File Prevention**
- **Enhanced file existence checking**: Now detects files with `(2)`, `(3)`, etc. suffixes that yt-dlp creates
- **Automatic duplicate prevention**: Skips downloads when files already exist (including duplicates)
- **Updated yt-dlp configuration**: Set `"nooverwrites": false` to prevent yt-dlp from creating duplicate files
- **Cleanup utility**: `data/cleanup_duplicate_files.py` helps identify and remove existing duplicate files
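The duplicate detection the cleanup utility performs can be sketched as follows: a file counts as a duplicate only if its `(n)`-suffixed name has a matching original. This is an assumed reconstruction, not the actual `cleanup_duplicate_files.py` code:

```python
import re
from pathlib import Path

DUP_RE = re.compile(r"^(?P<base>.+) \((?P<n>\d+)\)\.mp4$")

def find_duplicates(folder):
    """Return '(2)'/'(3)'-suffixed files whose original 'base.mp4' also exists."""
    folder = Path(folder)
    dups = []
    for path in folder.glob("*.mp4"):
        m = DUP_RE.match(path.name)
        if m and (folder / f"{m.group('base')}.mp4").exists():
            dups.append(path)
    return dups
```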
### **Filename vs ID3 Tag Consistency**
- **Consistent metadata**: Filename and ID3 tag now use identical artist/title format
- **Removed extra suffixes**: No more "(Karaoke Version)" in ID3 tags that don't match filenames
- **Unified parsing**: Both filename generation and ID3 tagging use the same artist/title extraction
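Keeping filename and ID3 metadata identical means deriving both from the same artist/title pair. A sketch, assuming the 'Artist - Title.mp4' naming convention; the dict keys are the iTunes atom names mutagen's `MP4` tags use (`\xa9ART` for artist, `\xa9nam` for title), so the result could be applied via `mutagen.mp4.MP4(path).update(...)`:

```python
def tags_from_filename(filename):
    """Derive artist/title metadata that exactly matches the filename (sketch)."""
    stem = filename.rsplit(".", 1)[0]
    artist, _, title = stem.partition(" - ")
    # Keys are the MP4 atom names mutagen expects for artist and title
    return {"\xa9ART": [artist], "\xa9nam": [title]}
```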
### **Benefits**
- ✅ **No more duplicate files** with `(2)`, `(3)` suffixes
- ✅ **Consistent metadata** between filename and ID3 tags
- ✅ **Efficient disk usage** by preventing unnecessary downloads
- ✅ **Clear file identification** with consistent naming
### **Clean Up Existing Duplicates**
```bash
# Run the cleanup utility to find and remove existing duplicates
python data/cleanup_duplicate_files.py
# Choose option 1 for dry run (recommended first)
# Choose option 2 to actually delete duplicates
```
## 📋 Requirements
- **Windows 10/11**
- **Windows 10/11 or macOS 10.14+**
- **Python 3.7+**
- **yt-dlp.exe** (in `downloader/`)
- **yt-dlp binary** (platform-specific, see setup instructions below)
- **mutagen** (for ID3 tagging, optional)
- **ffmpeg/ffprobe** (for video validation, optional but recommended)
- **rapidfuzz** (for fuzzy matching, optional, falls back to difflib)
## 🍎 macOS Setup
### Automatic Setup (Recommended)
Run the macOS setup script to automatically set up yt-dlp and FFmpeg:
```bash
python3 setup_macos.py
```
This script will:
- Detect your macOS version
- Offer installation options for yt-dlp (pip or binary download)
- Install FFmpeg via Homebrew
- Test the installation
### Manual Setup
If you prefer to set up manually:
#### Option 1: Install yt-dlp via pip
```bash
pip3 install yt-dlp
```
#### Option 2: Download yt-dlp binary
```bash
mkdir -p downloader
curl -L -o downloader/yt-dlp_macos https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos
chmod +x downloader/yt-dlp_macos
```
#### Install FFmpeg
```bash
brew install ffmpeg
```
### Test Installation
```bash
python3 src/tests/test_macos.py
```
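The automatic platform detection described above can be sketched like this; the function name and fallback behavior are assumptions for illustration, not the tool's actual detection code:

```python
import platform
from pathlib import Path

def locate_ytdlp(downloader_dir="downloader"):
    """Pick the platform-appropriate yt-dlp binary (sketch)."""
    system = platform.system()
    if system == "Windows":
        candidate = Path(downloader_dir) / "yt-dlp.exe"
    elif system == "Darwin":
        candidate = Path(downloader_dir) / "yt-dlp_macos"
    else:
        candidate = None
    if candidate is not None and candidate.exists():
        return str(candidate)
    return "yt-dlp"  # fall back to a pip-installed binary on PATH
```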
## 🚀 Quick Start
> **💡 Pro Tip**: For a complete list of all available commands, see `commands.txt` - you can copy/paste any command directly into your terminal!
@@ -96,6 +251,21 @@ The codebase has been comprehensively refactored into a modular architecture wit
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
```
### Download ALL Videos from a Channel (Not Just Songlist Matches)
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
```
### Download ALL Videos with Parallel Processing
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
```
### Download ALL Videos with Limit
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100
```
### Download Only Songlist Songs (Fast Mode)
```bash
python download_karaoke.py --songlist-only --limit 5
@@ -103,7 +273,7 @@ python download_karaoke.py --songlist-only --limit 5
### Download with Parallel Processing
```bash
python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10
python download_karaoke.py --parallel --songlist-only --limit 10
```
### Focus on Specific Playlists by Title
@@ -111,11 +281,31 @@ python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"
```
### Focus on Specific Playlists from Custom File
```bash
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json"
```
### Force Download from Channels (Bypass All Existing File Checks)
```bash
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force
```
### Download with Fuzzy Matching
```bash
python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85
```
### Test Download Plan (Dry Run)
```bash
python download_karaoke.py --songlist-only --limit 5 --dry-run
```
### Test Channel Download Plan (Dry Run)
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 10 --dry-run
```
### Download Latest N Videos Per Channel
```bash
python download_karaoke.py --latest-per-channel --limit 5
@@ -220,19 +410,33 @@ KaroakeVideoDownloader/
│ ├── check_resolution.py # Resolution checker utility
│ ├── resolution_cli.py # Resolution config CLI
│ └── tracking_cli.py # Tracking management CLI
├── data/ # All config, tracking, cache, and songlist files
│ ├── config.json
├── config/ # Configuration files
│ └── config.json # Main configuration file
├── data/ # All tracking, cache, and songlist files
│ ├── karaoke_tracking.json
│ ├── songlist_tracking.json
│ ├── channel_cache.json
│ ├── channels.txt
│ ├── channels.json # Channel configuration with parsing rules
│ └── songList.json
├── utilities/ # Utility scripts and tools
│ ├── add_manual_video.py # Manual video management
│ ├── build_cache_from_raw.py # Cache building utility
│ ├── cleanup_duplicate_files.py # File cleanup utilities
│ ├── cleanup_recent_tracking.py # Tracking cleanup utilities
│ ├── deduplicate_songlist_tracking.py # Data deduplication
│ ├── fix_artist_name_format.py # Data cleanup utilities
│ ├── fix_artist_name_format_simple.py
│ ├── fix_code_quality.py # Development tools
│ ├── reset_and_redownload.py # Maintenance utilities
│ └── songlist_report.py # Reporting utilities
├── downloads/ # All video output
│ └── [ChannelName]/ # Per-channel folders
├── logs/ # Download logs
├── downloader/yt-dlp.exe # yt-dlp binary
├── tests/ # Diagnostic and test scripts
│ └── test_installation.py
├── downloader/yt-dlp.exe # yt-dlp binary (Windows)
├── downloader/yt-dlp_macos # yt-dlp binary (macOS)
├── src/tests/ # Test scripts
│ ├── test_macos.py # macOS setup and functionality tests
│ └── test_platform.py # Platform detection tests
├── download_karaoke.py # Main entry point (thin wrapper)
├── README.md
├── PRD.md
@@ -249,6 +453,7 @@ KaroakeVideoDownloader/
- `--songlist-priority`: Prioritize songlist songs in download queue
- `--songlist-only`: Download only songs from the songlist
- `--songlist-focus <PLAYLIST_TITLE1> <PLAYLIST_TITLE2>...`: Focus on specific playlists by title (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`)
- `--songlist-file <FILE_PATH>`: Custom songlist file path to use with --songlist-focus (default: data/songList.json)
- `--songlist-status`: Show songlist download progress
- `--limit <N>`: Limit number of downloads (enables fast mode with early exit)
- `--resolution <720p|1080p|...>`: Override resolution
@@ -260,8 +465,14 @@ KaroakeVideoDownloader/
- `--latest-per-channel`: **Download the latest N videos from each channel (use with --limit)**
- `--fuzzy-match`: Enable fuzzy matching for songlist-to-video matching (uses rapidfuzz if available)
- `--fuzzy-threshold <N>`: Fuzzy match threshold (0-100, default 85)
- `--parallel`: Enable parallel downloads for improved speed
- `--workers <N>`: Number of parallel download workers (1-10, default: 3)
- `--parallel`: Enable parallel downloads for improved speed (defaults to 3 workers)
- `--workers <N>`: Number of parallel download workers (1-10, default: 3, only used with --parallel)
- `--generate-songlist <DIR1> <DIR2>...`: **Generate song list from MP4 files with ID3 tags in specified directories**
- `--no-append-songlist`: **Create a new song list instead of appending when using --generate-songlist**
- `--force`: **Force download from channels, bypassing all existing file checks and re-downloading if necessary**
- `--channel-focus <CHANNEL_NAME>`: **Download from a specific channel by name (e.g., 'SingKingKaraoke')**
- `--all-videos`: **Download all videos from channel (not just songlist matches), skipping existing files**
- `--dry-run`: **Build download plan and show what would be downloaded without actually downloading anything**
## 📝 Example Usage
@@ -272,30 +483,61 @@ KaroakeVideoDownloader/
python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85
# Parallel downloads for faster processing
python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10
python download_karaoke.py --parallel --songlist-only --limit 10
# Latest videos per channel with parallel downloads
python download_karaoke.py --parallel --workers 3 --latest-per-channel --limit 5
python download_karaoke.py --parallel --latest-per-channel --limit 5
# Traditional full scan (no limit)
python download_karaoke.py --songlist-only
# Focused fuzzy matching (target specific playlists with flexible matching)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --fuzzy-match --fuzzy-threshold 80 --limit 10
# Focus on specific playlists from a custom file
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json" --limit 10
# Force download with fuzzy matching (bypass all existing file checks)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force --fuzzy-match --fuzzy-threshold 80 --limit 10
# Channel-specific operations
python download_karaoke.py --reset-channel SingKingKaraoke
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
python download_karaoke.py --clear-cache all
python download_karaoke.py --clear-server-duplicates
# Download ALL videos from a specific channel
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100
# Song list generation from MP4 files
python download_karaoke.py --generate-songlist /path/to/mp4/directory
python download_karaoke.py --generate-songlist /path/to/dir1 /path/to/dir2 --no-append-songlist
# Generate report of songs that couldn't be found
python download_karaoke.py --generate-unmatched-report
python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-threshold 85
```
## 🏷️ ID3 Tagging
- Adds artist/title/album/genre to MP4 files using mutagen (if installed)
## 📋 Song List Generation
- **Generate song lists from existing MP4 files**: Use `--generate-songlist` to create song lists from directories containing MP4 files with ID3 tags
- **Automatic ID3 extraction**: Extracts artist and title from MP4 files' ID3 tags
- **Directory-based organization**: Each directory becomes a playlist with the directory name as the title
- **Position tracking**: Songs are numbered starting from 1 based on file order
- **Append or replace**: Choose to append to existing song list or create a new one with `--no-append-songlist`
- **Multiple directories**: Process multiple directories in a single command
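The generation steps above can be sketched as one playlist per directory, numbered from 1 in file order. The tag reader is injected here so the sketch stays testable without real MP4 files; the actual feature reads ID3 tags via mutagen:

```python
from pathlib import Path

def build_playlist(directory, read_tags):
    """Build a playlist dict for one directory (sketch).

    read_tags(path) -> (artist, title) is a hypothetical callback standing in
    for the real ID3 extraction.
    """
    directory = Path(directory)
    songs = []
    for pos, path in enumerate(sorted(directory.glob("*.mp4")), start=1):
        artist, title = read_tags(path)
        songs.append({"position": pos, "artist": artist, "title": title})
    # The directory name becomes the playlist title
    return {"title": directory.name, "songs": songs}
```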
## 🧹 Cleanup
- Removes `.info.json` and `.meta` files after download
## 🛠️ Configuration
- All options are in `data/config.json` (format, resolution, metadata, etc.)
- All options are in `config/config.json` (format, resolution, metadata, etc.)
- You can edit this file or use CLI flags to override
- **Configurable Data Directory**: The data directory path can be configured in `config/config.json` under `folder_structure.data_dir` (default: "data")
## 📋 Command Reference File
@@ -311,6 +553,31 @@ python download_karaoke.py --clear-server-duplicates
> **🔄 Maintenance Note**: The `commands.txt` file should be kept up to date with any CLI changes. When adding new command-line options or modifying existing ones, update this file to reflect all available commands and their usage.
## 📚 Documentation Standards
### **Documentation Location**
- **All changes, refactoring, and improvements should be documented in the PRD.md and README.md files**
- **Do NOT create separate .md files for documenting changes, refactoring, or improvements**
- **Use the existing sections in PRD.md and README.md to track all project evolution**
### **Where to Document Changes**
- **PRD.md**: Technical details, architecture changes, bug fixes, and implementation specifics
- **README.md**: User-facing features, usage instructions, and high-level improvements
- **CHANGELOG.md**: Version-specific release notes and change summaries
### **Documentation Requirements**
- **All new features must be documented in both PRD.md and README.md**
- **All refactoring efforts must be documented in the appropriate sections**
- **All bug fixes must be documented with technical details**
- **Version numbers and dates should be clearly marked**
- **Benefits and improvements should be explicitly stated**
### **Maintenance Responsibility**
- **Keep PRD.md and README.md synchronized with code changes**
- **Update documentation immediately when implementing new features**
- **Remove outdated information and consolidate related changes**
- **Ensure all CLI options and features are documented in both files**
## 🔧 Refactoring Improvements (v3.3)
The codebase has been comprehensively refactored to improve maintainability and reduce code duplication. Recent improvements have enhanced reliability, performance, and code organization:
@@ -348,7 +615,7 @@ The codebase has been comprehensively refactored to improve maintainability and
### **New Parallel Download System (v3.4)**
- **Parallel downloader module:** `parallel_downloader.py` provides thread-safe concurrent download management
- **Configurable concurrency:** Use `--parallel --workers N` to enable parallel downloads with N workers (1-10)
- **Configurable concurrency:** Use `--parallel` to enable parallel downloads with 3 workers by default, or `--parallel --workers N` for custom worker count (1-10)
- **Thread-safe operations:** All tracking, caching, and progress operations are thread-safe
- **Real-time progress tracking:** Shows active downloads, completion status, and overall progress
- **Automatic retry mechanism:** Failed downloads are automatically retried with reduced concurrency
@@ -372,7 +639,8 @@ The codebase has been comprehensively refactored to improve maintainability and
- **Robust download plan execution:** Fixed index management in download plan execution to prevent errors during interrupted downloads.
## 🐞 Troubleshooting
- Ensure `yt-dlp.exe` is in the `downloader/` folder
- **Windows**: Ensure `yt-dlp.exe` is in the `downloader/` folder
- **macOS**: Run `python3 setup_macos.py` to set up yt-dlp and FFmpeg
- Check `logs/` for error details
- Use `python -m karaoke_downloader.check_resolution` to verify video quality
- If you see errors about ffmpeg/ffprobe, install [ffmpeg](https://ffmpeg.org/download.html) and ensure it is in your PATH


@@ -1,6 +1,6 @@
# 🎤 Karaoke Video Downloader - CLI Commands Reference
# Copy and paste these commands into your terminal
# Updated: v3.4 (includes parallel downloads and all refactoring improvements)
# Updated: v3.4.4 (includes macOS support, all videos download mode, manual video collection, channel parsing rules, and all previous improvements)
## 📥 BASIC DOWNLOADS
@@ -8,7 +8,7 @@
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
# Download from a file containing multiple channel URLs
python download_karaoke.py --file data/channels.txt
python download_karaoke.py --file data/channels.json
# Download with custom resolution (480p, 720p, 1080p, 1440p, 2160p)
python download_karaoke.py --resolution 1080p https://www.youtube.com/@SingKingKaraoke/videos
@@ -19,9 +19,69 @@ python download_karaoke.py --limit 10 https://www.youtube.com/@SingKingKaraoke/v
# Enable parallel downloads for faster processing (3-5x speedup)
python download_karaoke.py --parallel --workers 5 --limit 10 https://www.youtube.com/@SingKingKaraoke/videos
## 🎤 MANUAL VIDEO COLLECTION (v3.4.3)
# Download from manual videos collection (data/manual_videos.json)
python download_karaoke.py --manual --limit 5
# Download manual videos with fuzzy matching
python download_karaoke.py --manual --fuzzy-match --fuzzy-threshold 85 --limit 10
# Download manual videos with parallel processing
python download_karaoke.py --parallel --workers 3 --manual --limit 5
# Download manual videos with songlist matching
python download_karaoke.py --manual --songlist-only --limit 10
# Force download from manual videos (bypass existing file checks)
python download_karaoke.py --manual --force --limit 5
# Add a video to manual collection
python utilities/add_manual_video.py add "Artist - Song Title (Karaoke Version)" "https://www.youtube.com/watch?v=VIDEO_ID"
# List all manual videos
python utilities/add_manual_video.py list
# Remove a video from manual collection
python utilities/add_manual_video.py remove "Artist - Song Title (Karaoke Version)"
## 🎬 ALL VIDEOS DOWNLOAD MODE (v3.4.4)
# Download ALL videos from a specific channel (not just songlist matches)
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
# Download ALL videos with parallel processing for speed
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
# Download ALL videos with limit (download first N videos)
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100
# Download ALL videos with parallel processing and limit
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 5 --limit 50
# Download ALL videos from ZoomKaraokeOfficial channel
python download_karaoke.py --channel-focus ZoomKaraokeOfficial --all-videos
# Download ALL videos with custom resolution
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --resolution 1080p
## 📋 SONG LIST GENERATION
# Generate song list from MP4 files in a directory (append to existing song list)
python download_karaoke.py --generate-songlist /path/to/mp4/directory
# Generate song list from multiple directories
python download_karaoke.py --generate-songlist /path/to/dir1 /path/to/dir2 /path/to/dir3
# Generate song list and create a new song list file (don't append)
python download_karaoke.py --generate-songlist /path/to/mp4/directory --no-append-songlist
# Generate song list from multiple directories and create new file
python download_karaoke.py --generate-songlist /path/to/dir1 /path/to/dir2 --no-append-songlist
## 🎵 SONGLIST OPERATIONS
# Download only songs from your songlist (uses data/channels.txt by default)
# Download only songs from your songlist (uses data/channels.json by default)
python download_karaoke.py --songlist-only
# Download only songlist songs with limit
@@ -51,6 +111,18 @@ python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --limit 5
# Focus on specific playlists with parallel processing
python download_karaoke.py --parallel --workers 3 --songlist-focus "2025 - Apple Top 50" --limit 5
# Focus on specific playlists from a custom songlist file
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json"
# Focus on specific playlists from a custom file with force mode
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json" --force
# Force download from channels regardless of existing files or server duplicates
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force
# Force download with parallel processing
python download_karaoke.py --parallel --workers 5 --songlist-focus "2025 - Apple Top 50" --force --limit 10
# Prioritize songlist songs in download queue (default behavior)
python download_karaoke.py --songlist-priority https://www.youtube.com/@SingKingKaraoke/videos
@@ -60,6 +132,35 @@ python download_karaoke.py --no-songlist-priority https://www.youtube.com/@SingK
# Show songlist download status and statistics
python download_karaoke.py --songlist-status
## 📊 UNMATCHED SONGS REPORTS
# Generate report of songs that couldn't be found in any channel (standalone)
python download_karaoke.py --generate-unmatched-report
# Generate report with fuzzy matching enabled (standalone)
python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-threshold 85
# Generate report using a specific channel file (standalone)
python download_karaoke.py --generate-unmatched-report --file data/my_channels.txt
# Generate report from a custom songlist file (standalone)
python download_karaoke.py --generate-unmatched-report --songlist-file "data/my_custom_songlist.json"
# Generate report with focus on specific playlists from a custom file (standalone)
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json" --generate-unmatched-report
# Download songs AND generate unmatched report (additive feature)
python download_karaoke.py --songlist-only --limit 10 --generate-unmatched-report
# Download with fuzzy matching AND generate unmatched report
python download_karaoke.py --songlist-only --fuzzy-match --fuzzy-threshold 85 --limit 10 --generate-unmatched-report
# Download from specific playlists AND generate unmatched report
python download_karaoke.py --songlist-focus "CCKaraoke" --limit 10 --generate-unmatched-report
# Generate report with custom fuzzy threshold
python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-threshold 80
## ⚡ PARALLEL DOWNLOADS (v3.4)
# Basic parallel downloads (3-5x faster than sequential)
@@ -94,7 +195,7 @@ python download_karaoke.py --parallel --workers 3 --latest-per-channel --limit 5
python download_karaoke.py --parallel --workers 3 --latest-per-channel --limit 5 --fuzzy-match --fuzzy-threshold 85
# Download latest videos from specific channels file
python download_karaoke.py --latest-per-channel --limit 5 --file data/channels.txt
python download_karaoke.py --latest-per-channel --limit 5 --file data/channels.json
## 🔄 CACHE & TRACKING MANAGEMENT
@ -153,7 +254,7 @@ python download_karaoke.py --version
python download_karaoke.py --songlist-only --limit 20 --fuzzy-match --fuzzy-threshold 85 --resolution 1080p
# Latest videos per channel with fuzzy matching
python download_karaoke.py --latest-per-channel --limit 3 --fuzzy-match --fuzzy-threshold 90 --file data/channels.txt
python download_karaoke.py --latest-per-channel --limit 3 --fuzzy-match --fuzzy-threshold 90 --file data/channels.json
# Force refresh everything and download songlist
python download_karaoke.py --songlist-only --force-download-plan --refresh --limit 10
@ -172,6 +273,9 @@ python download_karaoke.py --parallel --workers 5 --songlist-only --limit 10
# 1b. Focus on specific playlists (fast targeted download)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --limit 5
# 1c. Force download from specific playlists (bypass all existing file checks)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force --limit 5
# 2. Latest videos from all channels
python download_karaoke.py --latest-per-channel --limit 5
@ -190,6 +294,9 @@ python download_karaoke.py --parallel --workers 5 --songlist-only --fuzzy-match
# 4b. Focused fuzzy matching (target specific playlists with flexible matching)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --fuzzy-match --fuzzy-threshold 80 --limit 10
# 4c. Force download with fuzzy matching (bypass all existing file checks)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force --fuzzy-match --fuzzy-threshold 80 --limit 10
# 5. Reset and start fresh
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
@ -197,6 +304,33 @@ python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
python download_karaoke.py --status
python download_karaoke.py --clear-cache all
# 7. Download from manual video collection
python download_karaoke.py --manual --limit 5
# 7b. Fast parallel manual video download
python download_karaoke.py --parallel --workers 3 --manual --limit 5
# 7c. Manual videos with fuzzy matching
python download_karaoke.py --manual --fuzzy-match --fuzzy-threshold 85 --limit 10
## 🍎 macOS SETUP COMMANDS
# Automatic macOS setup (detects OS and installs yt-dlp + FFmpeg)
python3 setup_macos.py
# Test macOS setup and functionality
python3 src/tests/test_macos.py
# Manual macOS setup options
# Install yt-dlp via pip
pip3 install yt-dlp
# Download yt-dlp binary for macOS
mkdir -p downloader && curl -L -o downloader/yt-dlp_macos https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos && chmod +x downloader/yt-dlp_macos
# Install FFmpeg via Homebrew
brew install ffmpeg
## 🔧 TROUBLESHOOTING COMMANDS
# Check if everything is working
@ -212,7 +346,9 @@ python download_karaoke.py --clear-server-duplicates
## 📝 NOTES
# Default files used:
# - data/channels.txt (default channel list for songlist modes)
# - data/channels.json (channel configuration with parsing rules, preferred)
# - data/channels.json (channel configuration with parsing rules)
# - data/manual_videos.json (manual video collection)
# - data/songList.json (your prioritized song list)
# - data/config.json (download settings)
@ -221,11 +357,12 @@ python download_karaoke.py --clear-server-duplicates
# Fuzzy threshold: 0-100 (higher = more strict matching, default 90)
# The system automatically:
# - Uses data/channels.txt if no --file specified in songlist modes
# - Uses data/channels.json for channel configuration and parsing rules
# - Caches channel data for 24 hours (configurable)
# - Tracks all downloads in JSON files
# - Avoids re-downloading existing files
# - Checks for server duplicates
# - Supports manual video collection via --manual parameter
# For best performance:
# - Use --parallel --workers 5 for 3-5x faster downloads
@ -233,6 +370,7 @@ python download_karaoke.py --clear-server-duplicates
# - Use --fuzzy-match for better song discovery
# - Use --refresh sparingly (forces re-scan)
# - Clear cache if you encounter issues
# - macOS users: Run `python3 setup_macos.py` for automatic setup
# Parallel download tips:
# - Start with --workers 3 for conservative approach
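The 0-100 fuzzy-threshold scale described in the notes above can be illustrated with Python's standard library. This is a hedged sketch only — the project's actual matcher may use a different algorithm — but the contract is the same: higher scores mean closer titles, and the threshold is the minimum score accepted.

```python
from difflib import SequenceMatcher

def fuzzy_score(a: str, b: str) -> int:
    """Return a 0-100 similarity score (illustrative, not the project's exact matcher)."""
    return int(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)

# A threshold of 90 accepts only near-exact titles; 80 tolerates small
# wording differences such as a missing apostrophe.
score = fuzzy_score("Gracie Abrams - That's So True",
                    "Gracie Abrams - Thats So True")
print(score)  # high-90s: passes --fuzzy-threshold 90
```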


@ -19,13 +19,14 @@
"writethumbnail": false,
"embed_metadata": false,
"continuedl": true,
"nooverwrites": true,
"nooverwrites": false,
"ignoreerrors": true,
"no_warnings": false
},
"folder_structure": {
"downloads_dir": "downloads",
"logs_dir": "logs",
"data_dir": "data",
"tracking_file": "downloaded_videos.json"
},
"logging": {
@ -34,5 +35,12 @@
"include_console": true,
"include_file": true
},
"platform_settings": {
"auto_detect_platform": true,
"yt_dlp_paths": {
"windows": "downloader/yt-dlp.exe",
"macos": "downloader/yt-dlp_macos"
}
},
"yt_dlp_path": "downloader/yt-dlp.exe"
}
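The `platform_settings` block above drives platform-aware binary selection. A minimal sketch of how such a config could be resolved (the helper name `resolve_yt_dlp_path` is illustrative; the project's `config_manager` implements the real logic):

```python
import sys

def resolve_yt_dlp_path(config: dict) -> str:
    """Pick the yt-dlp path for the current OS, falling back to the legacy key."""
    settings = config.get("platform_settings", {})
    paths = settings.get("yt_dlp_paths", {})
    if settings.get("auto_detect_platform") and paths:
        key = "macos" if sys.platform == "darwin" else "windows"
        if key in paths:
            return paths[key]
    # Legacy single-path fallback, as in the config above
    return config.get("yt_dlp_path", "yt-dlp")

config = {
    "platform_settings": {
        "auto_detect_platform": True,
        "yt_dlp_paths": {
            "windows": "downloader/yt-dlp.exe",
            "macos": "downloader/yt-dlp_macos",
        },
    },
    "yt_dlp_path": "downloader/yt-dlp.exe",
}
print(resolve_yt_dlp_path(config))
```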

File diff suppressed because it is too large Load Diff


@ -0,0 +1,19 @@
{
"channel_id": "@LetsSingKaraoke",
"videos": [
{
"title": "Sub Urban - Cradles | Karaoke (instrumental)",
"id": "8uj7IzhdiO4"
},
{
"title": "Sia - Snowman | Karaoke (instrumental)",
"id": "ZbWHuncTgsM"
},
{
"title": "Trevor Daniel - Falling | Karaoke (Instrumental)",
"id": "nU7n2aq7f98"
}
],
"last_updated": "2025-08-05T15:59:09.280488",
"video_count": 3
}


@ -0,0 +1,10 @@
# Raw yt-dlp output for @LetsSingKaraoke
# Channel URL: https://www.youtube.com/@LetsSingKaraoke/videos
# Command: downloader/yt-dlp_macos --flat-playlist --print %(title)s|%(id)s|%(url)s --verbose https://www.youtube.com/@LetsSingKaraoke/videos
# Timestamp: 2025-08-05T15:59:09.280155
# Total lines: 3
################################################################################
1: Sub Urban - Cradles | Karaoke (instrumental)|8uj7IzhdiO4|https://www.youtube.com/watch?v=8uj7IzhdiO4
2: Sia - Snowman | Karaoke (instrumental)|ZbWHuncTgsM|https://www.youtube.com/watch?v=ZbWHuncTgsM
3: Trevor Daniel - Falling | Karaoke (Instrumental)|nU7n2aq7f98|https://www.youtube.com/watch?v=nU7n2aq7f98
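Note that the pipe-delimited dump above uses `|` as a field separator while karaoke titles often contain `|` themselves, so splitting from the right is the safe way to recover the fields. A small sketch (assuming the `N: title|id|url` line shape shown above):

```python
line = ("1: Sub Urban - Cradles | Karaoke (instrumental)"
        "|8uj7IzhdiO4|https://www.youtube.com/watch?v=8uj7IzhdiO4")

body = line.split(": ", 1)[1]          # drop the "N: " line counter
# rsplit keeps any "|" inside the title intact; only the last two
# separators delimit the id and url fields.
title, video_id, url = body.rsplit("|", 2)
print(title, video_id, url)
```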


data/channels.json Normal file

@ -0,0 +1,191 @@
{
"channels": [
{
"name": "@SingKingKaraoke",
"url": "https://www.youtube.com/@SingKingKaraoke/videos",
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["(Karaoke)", "(Karaoke Version)", "Karaoke Version"]
}
},
"examples": [
"Artist - Title (Karaoke)",
"Artist - Title (Karaoke Version)"
]
},
"description": "Standard artist - title format with karaoke suffix"
},
{
"name": "@KaraokeOnVEVO",
"url": "https://www.youtube.com/@KaraokeOnVEVO/videos",
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["(Karaoke)"]
}
},
"examples": [
"George Jones - A Picture Of Me (Without You) (Karaoke)",
"Iggy Pop, Kate Pierson - Candy (Karaoke)"
]
},
"description": "Standard artist - title format with (Karaoke) suffix"
},
{
"name": "@StingrayKaraoke",
"url": "https://www.youtube.com/@StingrayKaraoke/videos",
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["(Karaoke Version)"]
}
},
"playlist_indicators": [
"TOP SONGS OF",
"THE BEST",
"BEST",
"NON-STOP",
"MASHUP",
"FEAT.",
"WITH LYRICS"
],
"examples": [
"Gracie Abrams - That's So True (Karaoke Version)",
"TOP SONGS OF 2024 KARAOKE WITH LYRICS BY BILLIE EILISH, GRACIE ABRAMS & MORE"
]
},
"description": "Standard artist - title format with (Karaoke Version) suffix, also has playlist titles"
},
{
"name": "@sing2karaoke",
"url": "https://www.youtube.com/@sing2karaoke/videos",
"parsing_rules": {
"format": "artist_title_spaces",
"separator": " ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["(Karaoke Version) Lyrics", "(Karaoke Version)", "Karaoke Version Lyrics"]
}
},
"multi_artist_separator": ", ",
"examples": [
"Lauren Spencer Smith Fingers Crossed",
"Calvin Harris, Clementine Douglas Blessings (Karaoke Version) Lyrics"
]
},
"description": "Artist and title separated by multiple spaces, supports multiple artists"
},
{
"name": "@ZoomKaraokeOfficial",
"url": "https://www.youtube.com/@ZoomKaraokeOfficial/videos",
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": [
"(Karaoke)",
"(Karaoke Version)",
"Karaoke Version",
"- Karaoke Version from Zoom Karaoke",
"- Karaoke Version from Zoom",
"- Karaoke Version from Zoom Karaoke (Radiohead Cover)",
"- Karaoke Version from Zoom (Radiohead Cover)"
]
}
},
"examples": [
"The Mavericks - Here Comes My Baby - Karaoke Version from Zoom Karaoke"
]
},
"description": "Standard artist - title format with '- Karaoke Version from Zoom Karaoke' suffix"
},
{
"name": "@VocalStarKaraoke",
"url": "https://www.youtube.com/@VocalStarKaraoke/videos",
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": false,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["KARAOKE Without Backing Vocals", "KARAOKE With Vocal Guide", "KARAOKE"]
}
},
"examples": [
"Don't Say You Love Me - Jin KARAOKE Without Backing Vocals",
"Don't Say You Love Me - Jin KARAOKE With Vocal Guide"
]
},
"description": "Title first, then dash separator, then artist with KARAOKE suffix"
},
{
"name": "@ManualVideos",
"url": "manual://static",
"manual_videos_file": "data/manual_videos.json",
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["(Karaoke)", "(Karaoke Version)", "(Karaoke Version) Lyrics"]
}
}
},
"description": "Manual collection of individual karaoke videos (static, never expires)"
},
{
"name": "Let's Sing Karaoke",
"url": "https://www.youtube.com/@LetsSingKaraoke/videos",
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["(Karaoke)", "(Karaoke Version)", "Karaoke Version", "(In the style of)"]
}
},
"examples": [
"Artist - Title (Karaoke)",
"Artist - Title (In the style of Other Artist)"
]
},
"artist_name_processing": true,
"description": "Let's Sing Karaoke with enhanced artist name processing"
}
],
"global_parsing_settings": {
"fallback_format": "artist_title_separator",
"fallback_separator": " - ",
"common_suffixes": [
"(Karaoke)",
"(Karaoke Version)",
"Karaoke Version",
"(Karaoke Version) Lyrics",
"Karaoke Version Lyrics"
],
"playlist_indicators": [
"TOP",
"BEST",
"MASHUP",
"FEAT.",
"WITH LYRICS",
"NON-STOP",
"PLAYLIST"
]
}
}
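A `parsing_rules` entry like the ones above is applied roughly as follows — split on the separator, strip any configured suffix, then order the parts by `artist_first`. This is a simplified sketch (the project's `ChannelParser` implements the full version); the rules dict is copied from the `@VocalStarKaraoke` entry, where the artist comes second:

```python
def clean(text: str, suffixes: list) -> str:
    """Strip the first matching suffix; list longest suffixes first."""
    text = text.strip()
    for s in suffixes:
        if text.endswith(s):
            return text[: -len(s)].strip()
    return text

def parse(video_title: str, rules: dict) -> tuple:
    sep = rules.get("separator", " - ")
    if sep not in video_title:
        return "", video_title.strip()
    left, right = video_title.split(sep, 1)
    suffixes = (rules.get("title_cleanup", {})
                     .get("remove_suffix", {})
                     .get("suffixes", []))
    left, right = clean(left, suffixes), clean(right, suffixes)
    return (left, right) if rules.get("artist_first", True) else (right, left)

vocal_star_rules = {
    "separator": " - ",
    "artist_first": False,
    "title_cleanup": {"remove_suffix": {"suffixes": [
        "KARAOKE Without Backing Vocals", "KARAOKE With Vocal Guide", "KARAOKE",
    ]}},
}
print(parse("Don't Say You Love Me - Jin KARAOKE Without Backing Vocals",
            vocal_star_rules))
```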


@ -1,7 +0,0 @@
https://www.youtube.com/@SingKingKaraoke/videos
https://www.youtube.com/@karafun/videos
https://www.youtube.com/@KaraokeOnVEVO/videos
https://www.youtube.com/@StingrayKaraoke/videos
https://www.youtube.com/@CCKaraoke/videos
https://www.youtube.com/@AtomicKaraoke/videos
https://www.youtube.com/@sing2karaoke/videos

data/karaoke_tracking.json Normal file

File diff suppressed because it is too large Load Diff

data/manual_videos.json Normal file

@ -0,0 +1,85 @@
{
"channel_name": "@ManualVideos",
"channel_url": "manual://static",
"description": "Manual collection of individual karaoke videos",
"videos": [
{
"title": "Nickelback - Photograph",
"url": "https://www.youtube.com/watch?v=qZXwpceqt9s",
"id": "qZXwpceqt9s",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
},
{
"title": "Ed Sheeran & Beyoncé - Perfect Duet",
"url": "https://www.youtube.com/watch?v=qegLWI99Wg0",
"id": "qegLWI99Wg0",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
},
{
"title": "10,000 Maniacs - More Than This",
"url": "https://www.youtube.com/watch?v=wxnuF-APJ5M",
"id": "wxnuF-APJ5M",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
},
{
"title": "AC/DC - Big Balls",
"url": "https://www.youtube.com/watch?v=kiSDpVmu4Bk",
"id": "kiSDpVmu4Bk",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
},
{
"title": "Jon Bon Jovi - Blaze of Glory",
"url": "https://www.youtube.com/watch?v=SzRAoDMlQY",
"id": "SzRAoDMlQY",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
},
{
"title": "ZZ Top - Sharp Dressed Man",
"url": "https://www.youtube.com/watch?v=prRalwto9iY",
"id": "prRalwto9iY",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
},
{
"title": "Nickelback - Photograph",
"url": "https://www.youtube.com/watch?v=qTphCTAUhUg",
"id": "qTphCTAUhUg",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
},
{
"title": "Billy Joel - Shes Got A Way",
"url": "https://www.youtube.com/watch?v=DeeTFIgKuC8",
"id": "DeeTFIgKuC8",
"upload_date": "2024-01-01",
"duration": 180,
"view_count": 1000
}
],
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": true,
"title_cleanup": {
"remove_suffix": {
"suffixes": [
"(Karaoke)",
"(Karaoke Version)",
"(Karaoke Version) Lyrics"
]
}
}
}
}



@ -23902,7 +23902,7 @@
"title": "Superman (It's Not Easy)"
},
{
"artist": "'N Sync",
"artist": "'NSync",
"position": 16,
"title": "Gone"
},
@ -24122,7 +24122,7 @@
"title": "Turn Off The Light"
},
{
"artist": "'N Sync",
"artist": "'NSync",
"position": 13,
"title": "Gone"
},
@ -24617,7 +24617,7 @@
"title": "Most Girls"
},
{
"artist": "'N Sync",
"artist": "'NSync",
"position": 11,
"title": "This I Promise You"
},
@ -24857,7 +24857,7 @@
"title": "I Just Wanna Love U (Give It 2 Me)"
},
{
"artist": "'N Sync",
"artist": "'NSync",
"position": 12,
"title": "This I Promise You"
},
@ -25857,7 +25857,7 @@
"title": "Tha Block Is Hot"
},
{
"artist": "'N Sync & Gloria Estefan",
"artist": "'NSync & Gloria Estefan",
"position": 85,
"title": "Music Of My Heart"
},
@ -26237,7 +26237,7 @@
"title": "Touch It"
},
{
"artist": "N Sync",
"artist": "NSync",
"position": 34,
"title": "(God Must Have Spent) A Little More Time On You"
},

data/songlist_tracking.json Normal file

File diff suppressed because it is too large Load Diff

downloader/yt-dlp_macos Executable file

Binary file not shown.


@ -9,6 +9,8 @@ import json
from datetime import datetime, timedelta
from pathlib import Path
from karaoke_downloader.data_path_manager import get_data_path_manager
# Constants
DEFAULT_CACHE_EXPIRATION_DAYS = 1
DEFAULT_CACHE_FILENAME_LENGTH_LIMIT = 200 # Increased from 60
@ -37,7 +39,7 @@ def get_download_plan_cache_file(mode, **kwargs):
+ hashlib.md5(base.encode()).hexdigest()[:8]
)
return Path(f"data/{base}.json")
return get_data_path_manager().get_path(f"{base}.json")
def load_cached_plan(cache_file, max_age_days=DEFAULT_CACHE_EXPIRATION_DAYS):


@ -0,0 +1,260 @@
"""
Channel-specific parsing utilities for extracting artist and title from video titles.
This module handles the different title formats used by various karaoke channels,
providing channel-specific parsing rules to extract artist and title information
correctly for ID3 tagging and filename generation.
"""
import json
import re
from typing import Dict, List, Optional, Tuple, Any
from pathlib import Path
from karaoke_downloader.data_path_manager import get_data_path_manager
class ChannelParser:
"""Handles channel-specific parsing of video titles to extract artist and title."""
def __init__(self, channels_file: str = None):
"""Initialize the parser with channel configuration."""
if channels_file is None:
channels_file = str(get_data_path_manager().get_channels_json_path())
self.channels_file = Path(channels_file)
self.channels_config = self._load_channels_config()
def _load_channels_config(self) -> Dict[str, Any]:
"""Load the channels configuration from JSON file."""
if not self.channels_file.exists():
raise FileNotFoundError(f"Channels configuration file not found: {self.channels_file}")
with open(self.channels_file, 'r', encoding='utf-8') as f:
return json.load(f)
def get_channel_config(self, channel_name: str) -> Optional[Dict[str, Any]]:
"""Get the configuration for a specific channel."""
for channel in self.channels_config.get("channels", []):
if channel["name"] == channel_name:
return channel
return None
def extract_artist_title(self, video_title: str, channel_name: str) -> Tuple[str, str]:
"""
Extract artist and title from a video title using channel-specific parsing rules.
Args:
video_title: The full video title from YouTube
channel_name: The name of the channel (must match config)
Returns:
Tuple of (artist, title) - both may be empty strings if parsing fails
"""
channel_config = self.get_channel_config(channel_name)
if not channel_config:
# Fallback to global settings
return self._fallback_parse(video_title)
parsing_rules = channel_config.get("parsing_rules", {})
format_type = parsing_rules.get("format", "artist_title_separator")
if format_type == "artist_title_separator":
return self._parse_artist_title_separator(video_title, parsing_rules)
elif format_type == "artist_title_spaces":
return self._parse_artist_title_spaces(video_title, parsing_rules)
elif format_type == "title_artist_pipe":
return self._parse_title_artist_pipe(video_title, parsing_rules)
else:
return self._fallback_parse(video_title)
def _parse_artist_title_separator(self, video_title: str, rules: Dict[str, Any]) -> Tuple[str, str]:
"""Parse format: 'Artist - Title' or 'Title - Artist'."""
separator = rules.get("separator", " - ")
artist_first = rules.get("artist_first", True)
if separator not in video_title:
return "", video_title.strip()
parts = video_title.split(separator, 1)
if len(parts) != 2:
return "", video_title.strip()
part1, part2 = parts[0].strip(), parts[1].strip()
# Apply cleanup to both parts
part1_clean = self._cleanup_title(part1, rules.get("title_cleanup", {}))
part2_clean = self._cleanup_title(part2, rules.get("title_cleanup", {}))
if artist_first:
return part1_clean, part2_clean
else:
return part2_clean, part1_clean
def _parse_artist_title_spaces(self, video_title: str, rules: Dict[str, Any]) -> Tuple[str, str]:
"""Parse format: 'Artist Title' (multiple spaces)."""
separator = rules.get("separator", " ")
multi_artist_sep = rules.get("multi_artist_separator", ", ")
# Try multiple space patterns to handle inconsistent spacing
# Look for the LAST occurrence of multiple spaces to handle cases with commas
space_patterns = ["   ", "  ", "    "]  # 3, 2, 4 spaces
for pattern in space_patterns:
if pattern in video_title:
# Split on the LAST occurrence of the pattern
last_index = video_title.rfind(pattern)
if last_index != -1:
artist_part = video_title[:last_index].strip()
title_part = video_title[last_index + len(pattern):].strip()
# Handle multiple artists (e.g., "Artist1, Artist2")
if multi_artist_sep in artist_part:
# Keep the full artist string as is
artist = artist_part
else:
artist = artist_part
title = self._cleanup_title(title_part, rules.get("title_cleanup", {}))
return artist, title
# Try dash patterns as fallback for inconsistent formatting
dash_patterns = [" - ", " – ", " -"]  # Regular dash, en dash, dash without trailing space
for pattern in dash_patterns:
if pattern in video_title:
# Split on the LAST occurrence of the pattern
last_index = video_title.rfind(pattern)
if last_index != -1:
artist_part = video_title[:last_index].strip()
title_part = video_title[last_index + len(pattern):].strip()
# Handle multiple artists (e.g., "Artist1, Artist2")
if multi_artist_sep in artist_part:
# Keep the full artist string as is
artist = artist_part
else:
artist = artist_part
title = self._cleanup_title(title_part, rules.get("title_cleanup", {}))
return artist, title
# If no pattern matches, return empty artist and full title
return "", video_title.strip()
def _parse_title_artist_pipe(self, video_title: str, rules: Dict[str, Any]) -> Tuple[str, str]:
"""Parse format: 'Title | Artist'."""
separator = rules.get("separator", " | ")
if separator not in video_title:
return "", video_title.strip()
parts = video_title.split(separator, 1)
if len(parts) != 2:
return "", video_title.strip()
title_part, artist_part = parts[0].strip(), parts[1].strip()
title = self._cleanup_title(title_part, rules.get("title_cleanup", {}))
artist = self._cleanup_title(artist_part, rules.get("artist_cleanup", {}))
return artist, title
def _cleanup_title(self, text: str, cleanup_rules: Dict[str, Any]) -> str:
"""Apply cleanup rules to remove suffixes and normalize text."""
if not cleanup_rules:
return text.strip()
cleaned = text.strip()
# Handle remove_suffix rule
if "remove_suffix" in cleanup_rules:
suffixes = cleanup_rules["remove_suffix"].get("suffixes", [])
for suffix in suffixes:
if cleaned.endswith(suffix):
cleaned = cleaned[:-len(suffix)].strip()
break
return cleaned
def _fallback_parse(self, video_title: str) -> Tuple[str, str]:
"""Fallback parsing using global settings."""
global_settings = self.channels_config.get("global_parsing_settings", {})
fallback_format = global_settings.get("fallback_format", "artist_title_separator")
fallback_separator = global_settings.get("fallback_separator", " - ")
if fallback_format == "artist_title_separator":
if fallback_separator in video_title:
parts = video_title.split(fallback_separator, 1)
if len(parts) == 2:
artist = parts[0].strip()
title = parts[1].strip()
# Apply global suffix cleanup
for suffix in global_settings.get("common_suffixes", []):
if title.endswith(suffix):
title = title[:-len(suffix)].strip()
break
return artist, title
# If all else fails, return empty artist and full title
return "", video_title.strip()
def is_playlist_title(self, video_title: str, channel_name: str) -> bool:
"""Check if a video title appears to be a playlist rather than a single song."""
channel_config = self.get_channel_config(channel_name)
if not channel_config:
return self._is_playlist_by_global_rules(video_title)
parsing_rules = channel_config.get("parsing_rules", {})
playlist_indicators = parsing_rules.get("playlist_indicators", [])
if not playlist_indicators:
return self._is_playlist_by_global_rules(video_title)
title_upper = video_title.upper()
for indicator in playlist_indicators:
if indicator.upper() in title_upper:
return True
return False
def _is_playlist_by_global_rules(self, video_title: str) -> bool:
"""Check if title is a playlist using global rules."""
global_settings = self.channels_config.get("global_parsing_settings", {})
playlist_indicators = global_settings.get("playlist_indicators", [])
title_upper = video_title.upper()
for indicator in playlist_indicators:
if indicator.upper() in title_upper:
return True
return False
def get_all_channel_names(self) -> List[str]:
"""Get a list of all configured channel names."""
return [channel["name"] for channel in self.channels_config.get("channels", [])]
def get_channel_url(self, channel_name: str) -> Optional[str]:
"""Get the URL for a specific channel."""
channel_config = self.get_channel_config(channel_name)
return channel_config.get("url") if channel_config else None
# Convenience function for backward compatibility
def extract_artist_title(video_title: str, channel_name: str, channels_file: str = None) -> Tuple[str, str]:
"""
Convenience function to extract artist and title from a video title.
Args:
video_title: The full video title from YouTube
channel_name: The name of the channel
channels_file: Path to the channels configuration file
Returns:
Tuple of (artist, title)
"""
if channels_file is None:
channels_file = str(get_data_path_manager().get_channels_json_path())
parser = ChannelParser(channels_file)
return parser.extract_artist_title(video_title, channel_name)
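The playlist-title heuristic used by `is_playlist_title` above boils down to a case-insensitive indicator search. A standalone sketch with a hand-picked indicator list (the real list lives in `channels.json`):

```python
# Illustrative subset of the indicators configured in channels.json
PLAYLIST_INDICATORS = ["TOP SONGS OF", "THE BEST", "NON-STOP", "MASHUP", "WITH LYRICS"]

def is_playlist_title(title: str) -> bool:
    """True if the title looks like a compilation/playlist rather than one song."""
    upper = title.upper()
    return any(indicator in upper for indicator in PLAYLIST_INDICATORS)

print(is_playlist_title("TOP SONGS OF 2024 KARAOKE WITH LYRICS BY BILLIE EILISH"))
print(is_playlist_title("Gracie Abrams - That's So True (Karaoke Version)"))
```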


@ -1,27 +1,117 @@
#!/usr/bin/env python3
"""
Karaoke Video Downloader CLI
Command-line interface for the karaoke video downloader.
"""
import argparse
import os
import sys
from pathlib import Path
from typing import List
from karaoke_downloader.channel_parser import ChannelParser
from karaoke_downloader.config_manager import AppConfig
from karaoke_downloader.data_path_manager import get_data_path_manager
from karaoke_downloader.downloader import KaraokeDownloader
# Constants
DEFAULT_LATEST_PER_CHANNEL_LIMIT = 10
DEFAULT_FUZZY_THRESHOLD = 85
DEFAULT_LATEST_PER_CHANNEL_LIMIT = 5
DEFAULT_DISPLAY_LIMIT = 10
DEFAULT_CACHE_DURATION_HOURS = 24
def load_channels_from_json(channels_file: str = None) -> List[str]:
"""
Load channel URLs from the new JSON format.
Args:
channels_file: Path to the channels.json file (if None, uses default from config)
Returns:
List of channel URLs
"""
if channels_file is None:
channels_file = str(get_data_path_manager().get_channels_json_path())
try:
parser = ChannelParser(channels_file)
channels = parser.channels_config.get("channels", [])
return [channel["url"] for channel in channels]
except Exception as e:
print(f"❌ Error loading channels from {channels_file}: {e}")
return []
def load_channels_from_text(channels_file: str = None) -> List[str]:
"""
Load channel URLs from the old text format (for backward compatibility).
Args:
channels_file: Path to the channels.txt file (if None, uses default from config)
Returns:
List of channel URLs
"""
if channels_file is None:
channels_file = str(get_data_path_manager().get_channels_txt_path())
try:
with open(channels_file, "r", encoding="utf-8") as f:
return [
line.strip()
for line in f
if line.strip() and not line.strip().startswith("#")
]
except Exception as e:
print(f"❌ Error loading channels from {channels_file}: {e}")
return []
def load_channels(channel_file: str = None) -> List[str]:
"""Load channel URLs from file."""
if channel_file is None:
# Use JSON configuration
data_path_manager = get_data_path_manager()
if data_path_manager.file_exists("channels.json"):
return load_channels_from_json()
else:
return []
else:
if channel_file.endswith(".json"):
return load_channels_from_json(channel_file)
else:
return load_channels_from_text(channel_file)
def get_channel_url_by_name(channel_name: str) -> str:
"""Look up a channel URL by its name from the channels configuration."""
channel_urls = load_channels()
# Normalize the channel name for comparison
normalized_name = channel_name.lower().replace("@", "").replace("karaoke", "").strip()
for url in channel_urls:
# Extract channel name from URL
if "/@" in url:
url_channel_name = url.split("/@")[1].split("/")[0].lower()
if url_channel_name == normalized_name or url_channel_name.replace("karaoke", "").strip() == normalized_name:
return url
return None
def main():
parser = argparse.ArgumentParser(
description="Karaoke Video Downloader - Download YouTube playlists and channel videos for karaoke",
description="Karaoke Video Downloader - Download YouTube playlists and channel videos for karaoke (default: downloads latest videos from all channels)",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python download_karaoke.py https://www.youtube.com/playlist?list=XYZ
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
python download_karaoke.py --file data/channels.txt
python download_karaoke.py --limit 10 # Download latest 10 videos from all channels
python download_karaoke.py --songlist-only --limit 10 # Download only songlist songs across channels
python download_karaoke.py --channel-focus SingKingKaraoke --limit 5 # Download from specific channel
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos # Download ALL videos from channel
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos # Download from specific channel URL
python download_karaoke.py --file data/channels.txt # Download from custom channel list
python download_karaoke.py --reset-channel SingKingKaraoke --delete-files
""",
)
@ -92,13 +182,34 @@ Examples:
parser.add_argument(
"--songlist-priority",
action="store_true",
help="Prioritize downloads based on data/songList.json (default: enabled)",
help="Prioritize downloads based on songList.json in the data directory (default: enabled)",
)
parser.add_argument(
"--no-songlist-priority",
action="store_true",
help="Disable songlist prioritization",
)
parser.add_argument(
"--generate-unmatched-report",
action="store_true",
help="Generate a report of songs that couldn't be found in any channel (runs after downloads)",
)
parser.add_argument(
"--show-pagination",
action="store_true",
help="Show page-by-page progress when downloading channel video lists (slower but more detailed)",
)
parser.add_argument(
"--parallel-channels",
action="store_true",
help="Enable parallel channel scanning for faster channel processing (scans multiple channels simultaneously)",
)
parser.add_argument(
"--channel-workers",
type=int,
default=3,
help="Number of parallel channel scanning workers (default: 3, max: 10)",
)
parser.add_argument(
"--songlist-only",
action="store_true",
@ -110,6 +221,16 @@ Examples:
metavar="PLAYLIST_TITLE",
help='Focus on specific playlists by title (e.g., --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100")',
)
parser.add_argument(
"--songlist-file",
metavar="FILE_PATH",
help="Custom songlist file path to use with --songlist-focus (default: songList.json in the data directory)",
)
parser.add_argument(
"--force",
action="store_true",
help="Force download from channels regardless of whether songs are already downloaded, on server, or marked as duplicates",
)
parser.add_argument(
"--songlist-status",
action="store_true",
@ -146,7 +267,7 @@ Examples:
parser.add_argument(
"--latest-per-channel",
action="store_true",
help="Download the latest N videos from each channel (use with --limit)",
help="Download the latest N videos from each channel (use with --limit) [DEPRECATED: This is now the default behavior]",
)
parser.add_argument(
"--fuzzy-match",
@ -156,19 +277,50 @@ Examples:
parser.add_argument(
"--fuzzy-threshold",
type=int,
default=90,
help="Fuzzy match threshold (0-100, default 90)",
default=DEFAULT_FUZZY_THRESHOLD,
help=f"Fuzzy match threshold (0-100, default {DEFAULT_FUZZY_THRESHOLD})",
)
parser.add_argument(
"--parallel",
action="store_true",
help="Enable parallel downloads for improved speed",
help="Enable parallel downloads for improved speed (3-5x faster for large batches, defaults to 3 workers)",
)
parser.add_argument(
"--workers",
type=int,
default=3,
help="Number of parallel download workers (default: 3, max: 10)",
help="Number of parallel download workers (default: 3, max: 10, only used with --parallel)",
)
parser.add_argument(
"--generate-songlist",
nargs="+",
metavar="DIRECTORY",
help="Generate song list from MP4 files with ID3 tags in specified directories",
)
parser.add_argument(
"--no-append-songlist",
action="store_true",
help="Create a new song list instead of appending when using --generate-songlist",
)
parser.add_argument(
"--manual",
action="store_true",
help="Download from manual videos collection (manual_videos.json in the data directory)",
)
parser.add_argument(
"--channel-focus",
type=str,
help="Download from a specific channel by name (e.g., 'SingKingKaraoke')",
)
parser.add_argument(
"--all-videos",
action="store_true",
help="Download all videos from channel (not just songlist matches), skipping existing files",
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Build download plan and show what would be downloaded without actually downloading anything",
)
args = parser.parse_args()
@@ -177,12 +329,42 @@ Examples:
print("❌ Error: --workers must be between 1 and 10")
sys.exit(1)
yt_dlp_path = Path("downloader/yt-dlp.exe")
if not yt_dlp_path.exists():
print("❌ Error: yt-dlp.exe not found in downloader/ directory")
print("Please ensure yt-dlp.exe is present in the downloader/ folder")
# Validate channel workers argument
if args.channel_workers < 1 or args.channel_workers > 10:
print("❌ Error: --channel-workers must be between 1 and 10")
sys.exit(1)
# Load configuration to get platform-aware yt-dlp path
from karaoke_downloader.config_manager import load_config
config = load_config()
yt_dlp_path = config.yt_dlp_path
# Check if it's a command string (like "python3 -m yt_dlp") or a file path
if yt_dlp_path.startswith("python"):  # "python3" is already covered by the "python" prefix
# It's a command string, test if it works
try:
import subprocess
cmd = yt_dlp_path.split() + ["--version"]
result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
if result.returncode != 0:
raise RuntimeError(f"Command failed: {result.stderr}")
except Exception as e:
platform_name = "macOS" if sys.platform == "darwin" else "Windows"
print(f"❌ Error: yt-dlp command failed: {yt_dlp_path}")
print(f"Please ensure yt-dlp is properly installed for {platform_name}")
print(f"Error: {e}")
sys.exit(1)
else:
# It's a file path, check if it exists
yt_dlp_file = Path(yt_dlp_path)
if not yt_dlp_file.exists():
platform_name = "macOS" if sys.platform == "darwin" else "Windows"
binary_name = yt_dlp_file.name
print(f"❌ Error: {binary_name} not found in downloader/ directory")
print(f"Please ensure {binary_name} is present in the downloader/ folder for {platform_name}")
print(f"Expected path: {yt_dlp_file}")
sys.exit(1)
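The startup check above branches on whether the configured yt-dlp value is a command string (e.g. `python3 -m yt_dlp`) or a binary path. A minimal self-contained sketch of that decision; `resolve_yt_dlp` is a hypothetical helper standing in for the inline CLI code:

```python
import subprocess
from pathlib import Path

def resolve_yt_dlp(yt_dlp_path: str) -> str:
    """Sketch of the CLI's yt-dlp validation: values starting with
    'python' are treated as a command string and probed with --version;
    anything else is treated as a file path and checked for existence.
    Returns 'command', 'file', or 'missing'."""
    if yt_dlp_path.startswith("python"):
        try:
            result = subprocess.run(
                yt_dlp_path.split() + ["--version"],
                capture_output=True, text=True, timeout=10,
            )
            return "command" if result.returncode == 0 else "missing"
        except (OSError, subprocess.TimeoutExpired):
            return "missing"
    return "file" if Path(yt_dlp_path).exists() else "missing"
```

Like the original, this only checks existence, not executability, and `Path.exists()` is also true for directories.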
downloader = KaraokeDownloader()
# Set parallel download options
@@ -210,9 +392,19 @@ Examples:
if args.songlist_focus:
downloader.songlist_focus_titles = args.songlist_focus
downloader.songlist_only = True # Enable songlist-only mode when focusing
args.songlist_only = True # Also set the args flag to ensure CLI logic works
print(
f"🎯 Songlist focus mode enabled for playlists: {', '.join(args.songlist_focus)}"
)
if args.songlist_file:
downloader.songlist_file_path = args.songlist_file
print(f"📁 Using custom songlist file: {args.songlist_file}")
if args.force:
downloader.force_download = True
print("💪 Force mode enabled - will download regardless of existing files or server duplicates")
if args.dry_run:
downloader.dry_run = True
print("🔍 Dry run mode enabled - will show download plan without downloading")
if args.resolution != "720p":
downloader.config_manager.update_resolution(args.resolution)
@@ -226,17 +418,16 @@ Examples:
sys.exit(0)
# --- END NEW ---
# --- NEW: If no URL or file is provided, but --songlist-only is set, use all channels in data/channels.txt ---
if args.songlist_only and not args.url and not args.file:
channels_file = Path("data/channels.txt")
if channels_file.exists():
args.file = str(channels_file)
# --- NEW: If no URL or file is provided, but --songlist-only is set, use all channels ---
if (args.songlist_only or args.songlist_focus) and not args.url and not args.file:
channel_urls = load_channels()
if channel_urls:
print(
"📋 No URL or --file provided, defaulting to all channels in data/channels.txt for songlist-only mode."
"📋 No URL or --file provided, defaulting to all configured channels for songlist mode."
)
else:
print(
"❌ No URL, --file, or data/channels.txt found. Please provide a channel URL or a file with channel URLs."
"❌ No URL, --file, or channel configuration found. Please provide a channel URL or create channels.json in the data directory."
)
sys.exit(1)
# --- END NEW ---
@@ -256,6 +447,22 @@ Examples:
print(" Songs will be re-checked against the server on next run.")
sys.exit(0)
if args.generate_songlist:
from karaoke_downloader.songlist_generator import SongListGenerator
print("🎵 Generating song list from MP4 files with ID3 tags...")
generator = SongListGenerator()
try:
generator.generate_songlist_from_multiple_directories(
args.generate_songlist,
append=not args.no_append_songlist
)
print("✅ Song list generation completed successfully!")
except Exception as e:
print(f"❌ Error generating song list: {e}")
sys.exit(1)
sys.exit(0)
if args.status:
stats = downloader.tracker.get_statistics()
print("🎤 Karaoke Downloader Status")
@@ -273,9 +480,10 @@ Examples:
print("💾 Channel Cache Information")
print("=" * 40)
print(f"Total Channels: {cache_info['total_channels']}")
print(f"Total Cached Videos: {cache_info['total_cached_videos']}")
print(f"Cache Duration: {cache_info['cache_duration_hours']} hours")
print(f"Last Updated: {cache_info['last_updated']}")
print(f"Total Cached Videos: {cache_info['total_videos']}")
print("\n📋 Channel Details:")
for channel in cache_info['channels']:
print(f"{channel['channel']}: {channel['videos']} videos (updated: {channel['last_updated']})")
sys.exit(0)
elif args.clear_cache:
if args.clear_cache == "all":
@@ -315,47 +523,77 @@ Examples:
if len(tracking) > 10:
print(f" ... and {len(tracking) - 10} more")
sys.exit(0)
elif args.songlist_only or args.songlist_focus:
# Use provided file or default to data/channels.txt
channel_file = args.file if args.file else "data/channels.txt"
if not os.path.exists(channel_file):
print(f"❌ Channel file not found: {channel_file}")
elif args.manual:
# Download from manual videos collection
print("🎤 Downloading from manual videos collection...")
success = downloader.download_channel_videos(
"manual://static",
force_refresh=args.refresh,
fuzzy_match=args.fuzzy_match,
fuzzy_threshold=args.fuzzy_threshold,
force_download=args.force,
)
elif args.channel_focus:
# Download from a specific channel by name
print(f"🎤 Looking up channel: {args.channel_focus}")
channel_url = get_channel_url_by_name(args.channel_focus)
if not channel_url:
print(f"❌ Channel '{args.channel_focus}' not found in configuration")
print("Available channels:")
channel_urls = load_channels()
for url in channel_urls:
if "/@" in url:
channel_name = url.split("/@")[1].split("/")[0]
print(channel_name)
sys.exit(1)
if args.all_videos:
# Download ALL videos from the channel (not just songlist matches)
print(f"🎤 Downloading ALL videos from channel: {args.channel_focus} ({channel_url})")
success = downloader.download_all_channel_videos(
channel_url,
force_refresh=args.refresh,
force_download=args.force,
limit=args.limit,
dry_run=args.dry_run,
)
else:
# Download only songlist matches from the channel
print(f"🎤 Downloading from channel: {args.channel_focus} ({channel_url})")
success = downloader.download_channel_videos(
channel_url,
force_refresh=args.refresh,
fuzzy_match=args.fuzzy_match,
fuzzy_threshold=args.fuzzy_threshold,
force_download=args.force,
dry_run=args.dry_run,
)
elif args.songlist_only or args.songlist_focus:
# Use provided file or default to channels configuration
channel_urls = load_channels(args.file)
if not channel_urls:
print("❌ No channels found in configuration")
sys.exit(1)
with open(channel_file, "r", encoding="utf-8") as f:
channel_urls = [
line.strip()
for line in f
if line.strip() and not line.strip().startswith("#")
]
limit = args.limit if args.limit else None
force_refresh_download_plan = (
args.force_download_plan if hasattr(args, "force_download_plan") else False
)
fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
fuzzy_threshold = (
args.fuzzy_threshold
if hasattr(args, "fuzzy_threshold")
else DEFAULT_FUZZY_THRESHOLD
)
success = downloader.download_songlist_across_channels(
channel_urls,
limit=limit,
force_refresh_download_plan=force_refresh_download_plan,
fuzzy_match=fuzzy_match,
fuzzy_threshold=fuzzy_threshold,
limit=args.limit,
force_refresh_download_plan=args.force_download_plan if hasattr(args, "force_download_plan") else False,
fuzzy_match=args.fuzzy_match,
fuzzy_threshold=args.fuzzy_threshold,
force_download=args.force,
show_pagination=args.show_pagination,
parallel_channels=args.parallel_channels,
max_channel_workers=args.channel_workers,
dry_run=args.dry_run,
)
elif args.latest_per_channel:
# Use provided file or default to data/channels.txt
channel_file = args.file if args.file else "data/channels.txt"
if not os.path.exists(channel_file):
print(f"❌ Channel file not found: {channel_file}")
# Use provided file or default to channels configuration
channel_urls = load_channels(args.file)
if not channel_urls:
print("❌ No channels found in configuration")
sys.exit(1)
with open(channel_file, "r", encoding="utf-8") as f:
channel_urls = [
line.strip()
for line in f
if line.strip() and not line.strip().startswith("#")
]
limit = args.limit if args.limit else DEFAULT_LATEST_PER_CHANNEL_LIMIT
force_refresh_download_plan = (
args.force_download_plan if hasattr(args, "force_download_plan") else False
@@ -372,14 +610,156 @@ Examples:
force_refresh_download_plan=force_refresh_download_plan,
fuzzy_match=fuzzy_match,
fuzzy_threshold=fuzzy_threshold,
force_download=args.force,
dry_run=args.dry_run,
)
elif args.url:
success = downloader.download_channel_videos(
args.url, force_refresh=args.refresh
args.url, force_refresh=args.refresh, dry_run=args.dry_run
)
else:
parser.print_help()
sys.exit(1)
# Default behavior: download from channels (equivalent to --latest-per-channel)
print("🎯 No specific mode specified, defaulting to download from channels")
channel_urls = load_channels(args.file)
if not channel_urls:
print("❌ No channels found in configuration")
print("Please provide a channel URL or create channels.json in the data directory")
sys.exit(1)
limit = args.limit if args.limit else DEFAULT_LATEST_PER_CHANNEL_LIMIT
force_refresh_download_plan = (
args.force_download_plan if hasattr(args, "force_download_plan") else False
)
fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
fuzzy_threshold = (
args.fuzzy_threshold
if hasattr(args, "fuzzy_threshold")
else DEFAULT_FUZZY_THRESHOLD
)
success = downloader.download_latest_per_channel(
channel_urls,
limit=limit,
force_refresh_download_plan=force_refresh_download_plan,
fuzzy_match=fuzzy_match,
fuzzy_threshold=fuzzy_threshold,
force_download=args.force,
dry_run=args.dry_run,
)
# Generate unmatched report if requested (additive feature)
if args.generate_unmatched_report:
from karaoke_downloader.download_planner import generate_unmatched_report, build_download_plan
from karaoke_downloader.songlist_manager import load_songlist
print("\n🔍 Generating unmatched songs report...")
# Load songlist based on focus mode
if args.songlist_focus:
# Load focused playlists
songlist_file_path = args.songlist_file if args.songlist_file else str(get_data_path_manager().get_songlist_path())
songlist_file = Path(songlist_file_path)
if not songlist_file.exists():
print(f"⚠️ Songlist file not found: {songlist_file_path}")
else:
try:
with open(songlist_file, "r", encoding="utf-8") as f:
raw_data = json.load(f)
# Filter playlists by title
focused_playlists = []
for playlist in raw_data:
playlist_title = playlist.get("title", "")
if playlist_title in args.songlist_focus:
focused_playlists.append(playlist)
if focused_playlists:
# Flatten the focused playlists into songs
focused_songs = []
seen = set()
for playlist in focused_playlists:
if "songs" in playlist:
for song in playlist["songs"]:
if "artist" in song and "title" in song:
artist = song["artist"].strip()
title = song["title"].strip()
key = f"{artist.lower()}_{title.lower()}"
if key in seen:
continue
seen.add(key)
focused_songs.append(
{
"artist": artist,
"title": title,
"position": song.get("position", 0),
}
)
songlist = focused_songs
else:
print(f"⚠️ No playlists found matching: {', '.join(args.songlist_focus)}")
songlist = []
except (json.JSONDecodeError, FileNotFoundError) as e:
print(f"⚠️ Could not load songlist for report: {e}")
songlist = []
else:
# Load all songs from songlist
songlist_path = args.songlist_file if args.songlist_file else str(get_data_path_manager().get_songlist_path())
songlist = load_songlist(songlist_path)
if songlist:
# Load channel URLs
channel_file = args.file if args.file else str(get_data_path_manager().get_channels_txt_path())
if os.path.exists(channel_file):
with open(channel_file, "r", encoding='utf-8') as f:
channel_urls = [
line.strip()
for line in f
if line.strip() and not line.strip().startswith("#")
]
print(f"📋 Analyzing {len(songlist)} songs against {len(channel_urls)} channels...")
# Build download plan to get unmatched songs
fuzzy_match = args.fuzzy_match if hasattr(args, "fuzzy_match") else False
fuzzy_threshold = (
args.fuzzy_threshold
if hasattr(args, "fuzzy_threshold")
else DEFAULT_FUZZY_THRESHOLD
)
try:
download_plan, unmatched = build_download_plan(
channel_urls,
songlist,
downloader.tracker,
downloader.yt_dlp_path,
fuzzy_match=fuzzy_match,
fuzzy_threshold=fuzzy_threshold,
)
if unmatched:
report_file = generate_unmatched_report(unmatched)
print("\n📋 Unmatched songs report generated successfully!")
print(f"📁 Report saved to: {report_file}")
print(f"📊 Summary: {len(download_plan)} songs found, {len(unmatched)} songs not found")
print("\n🔍 First 10 unmatched songs:")
for i, song in enumerate(unmatched[:10], 1):
print(f" {i:2d}. {song['artist']} - {song['title']}")
if len(unmatched) > 10:
print(f" ... and {len(unmatched) - 10} more songs")
else:
print(f"\n✅ All {len(songlist)} songs were found in the channels!")
except Exception as e:
print(f"❌ Error generating report: {e}")
else:
print(f"❌ Channel file not found: {channel_file}")
else:
print("❌ No songlist available for report generation")
# Initialize success variable
success = False
downloader.tracker.force_save()
if success:
print("\n🎤 All downloads completed successfully!")

View File

@@ -4,6 +4,8 @@ Provides centralized configuration loading, validation, and management.
"""
import json
import platform
import sys
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
@@ -34,6 +36,7 @@ DEFAULT_CONFIG = {
"folder_structure": {
"downloads_dir": "downloads",
"logs_dir": "logs",
"data_dir": "data",
"tracking_file": "data/karaoke_tracking.json",
},
"logging": {
@@ -42,6 +45,13 @@ DEFAULT_CONFIG = {
"include_console": True,
"include_file": True,
},
"platform_settings": {
"auto_detect_platform": True,
"yt_dlp_paths": {
"windows": "downloader/yt-dlp.exe",
"macos": "downloader/yt-dlp_macos"
}
},
"yt_dlp_path": "downloader/yt-dlp.exe",
}
@@ -55,6 +65,23 @@ RESOLUTION_MAP = {
}
def detect_platform() -> str:
"""Detect the current platform and return platform name."""
system = platform.system().lower()
if system == "windows":
return "windows"
elif system == "darwin":
return "macos"
else:
return "windows" # Default to Windows for other platforms
def get_platform_yt_dlp_path(platform_paths: Dict[str, str]) -> str:
"""Get the appropriate yt-dlp path for the current platform."""
platform_name = detect_platform()
return platform_paths.get(platform_name, platform_paths.get("windows", "downloader/yt-dlp.exe"))
@dataclass
class DownloadSettings:
"""Configuration for download settings."""
@@ -109,6 +136,7 @@ class FolderStructure:
downloads_dir: str = "downloads"
logs_dir: str = "logs"
data_dir: str = "data"
tracking_file: str = "data/karaoke_tracking.json"
@@ -139,14 +167,21 @@ class ConfigManager:
Manages application configuration with loading, validation, and caching.
"""
def __init__(self, config_file: Union[str, Path] = "data/config.json"):
def __init__(self, config_file: Union[str, Path] = "config/config.json", data_dir: Optional[str] = None):
"""
Initialize the configuration manager.
Args:
config_file: Path to the configuration file
data_dir: Optional custom data directory path
"""
self.config_file = Path(config_file)
# If config_file is relative and data_dir is provided, make it relative to data_dir
if data_dir and not Path(config_file).is_absolute():
self.config_file = Path(data_dir) / config_file
else:
self.config_file = Path(config_file)
self._data_dir = data_dir
self._config: Optional[AppConfig] = None
self._last_modified: Optional[datetime] = None
@@ -234,11 +269,21 @@
folder_structure = FolderStructure(**config_data.get("folder_structure", {}))
logging_config = LoggingConfig(**config_data.get("logging", {}))
# Handle platform-specific yt-dlp path
yt_dlp_path = config_data.get("yt_dlp_path", "downloader/yt-dlp.exe")
# Check if platform auto-detection is enabled
platform_settings = config_data.get("platform_settings", {})
if platform_settings.get("auto_detect_platform", True):
platform_paths = platform_settings.get("yt_dlp_paths", {})
if platform_paths:
yt_dlp_path = get_platform_yt_dlp_path(platform_paths)
return AppConfig(
download_settings=download_settings,
folder_structure=folder_structure,
logging=logging_config,
yt_dlp_path=config_data.get("yt_dlp_path", "downloader/yt-dlp.exe"),
yt_dlp_path=yt_dlp_path,
_config_file=self.config_file,
)
@@ -297,27 +342,35 @@
_config_manager: Optional[ConfigManager] = None
def get_config_manager() -> ConfigManager:
def get_config_manager(config_file: Optional[Union[str, Path]] = None, data_dir: Optional[str] = None) -> ConfigManager:
"""
Get the global configuration manager instance.
Args:
config_file: Optional path to config file (default: "config/config.json")
data_dir: Optional custom data directory path
Returns:
ConfigManager instance
"""
global _config_manager
if _config_manager is None:
_config_manager = ConfigManager()
if _config_manager is None or config_file is not None or data_dir is not None:
if config_file is None:
config_file = "config/config.json"
_config_manager = ConfigManager(config_file, data_dir)
return _config_manager
def load_config(force_reload: bool = False) -> AppConfig:
def load_config(force_reload: bool = False, config_file: Optional[Union[str, Path]] = None, data_dir: Optional[str] = None) -> AppConfig:
"""
Load configuration using the global manager.
Args:
force_reload: Force reload even if file hasn't changed
config_file: Optional path to config file (default: "config/config.json")
data_dir: Optional custom data directory path
Returns:
AppConfig instance
"""
return get_config_manager().load_config(force_reload)
return get_config_manager(config_file, data_dir).load_config(force_reload)
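`ConfigManager.__init__` above roots a relative `config_file` under `data_dir` when one is given. The precedence rule, sketched as a hypothetical standalone helper:

```python
from pathlib import Path
from typing import Optional

def resolve_config_file(config_file: str, data_dir: Optional[str]) -> Path:
    # Mirrors ConfigManager.__init__: a relative config_file is rooted
    # at data_dir when one is supplied; absolute paths are used as-is.
    if data_dir and not Path(config_file).is_absolute():
        return Path(data_dir) / config_file
    return Path(config_file)
```

So an absolute `config_file` always wins, and `data_dir` only changes where a relative one is looked up.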

View File

@@ -0,0 +1,184 @@
"""
Data path management utilities for the karaoke downloader.
Provides centralized data directory path management and file path resolution.
"""
import os
from pathlib import Path
from typing import Optional
from .config_manager import get_config_manager
class DataPathManager:
"""
Manages data directory paths and provides utilities for resolving file paths
relative to the configured data directory.
"""
def __init__(self, data_dir: Optional[str] = None):
"""
Initialize the data path manager.
Args:
data_dir: Optional custom data directory path. If None, uses config.
"""
self._data_dir = data_dir
# If a custom data directory is provided, look for config.json in that directory
if data_dir:
config_file = Path(data_dir) / "config.json"
self._config_manager = get_config_manager(str(config_file))
else:
# Otherwise, use the default config path ("config/config.json")
self._config_manager = get_config_manager()
@property
def data_dir(self) -> Path:
"""
Get the configured data directory path.
Returns:
Path to the data directory
"""
if self._data_dir:
return Path(self._data_dir)
# Get from config
config = self._config_manager.get_config()
data_dir = getattr(config.folder_structure, 'data_dir', 'data')
return Path(data_dir)
def get_path(self, filename: str) -> Path:
"""
Get the full path to a file in the data directory.
Args:
filename: Name of the file (e.g., 'config.json', 'channels.json')
Returns:
Full path to the file
"""
return self.data_dir / filename
def get_channels_json_path(self) -> Path:
"""Get path to channels.json file."""
return self.get_path('channels.json')
def get_channels_txt_path(self) -> Path:
"""Get path to channels.txt file."""
return self.get_path('channels.txt')
def get_songlist_path(self) -> Path:
"""Get path to songList.json file."""
return self.get_path('songList.json')
def get_songlist_tracking_path(self) -> Path:
"""Get path to songlist_tracking.json file."""
return self.get_path('songlist_tracking.json')
def get_karaoke_tracking_path(self) -> Path:
"""Get path to karaoke_tracking.json file."""
return self.get_path('karaoke_tracking.json')
def get_server_duplicates_tracking_path(self) -> Path:
"""Get path to server_duplicates_tracking.json file."""
return self.get_path('server_duplicates_tracking.json')
def get_manual_videos_path(self) -> Path:
"""Get path to manual_videos.json file."""
return self.get_path('manual_videos.json')
def get_songs_path(self) -> Path:
"""Get path to songs.json file."""
return self.get_path('songs.json')
def get_channel_cache_dir(self) -> Path:
"""Get path to channel_cache directory."""
return self.get_path('channel_cache')
def get_channel_cache_path(self, channel_id: str) -> Path:
"""Get path to a specific channel cache file."""
return self.get_channel_cache_dir() / f"{channel_id}.json"
def get_download_plan_cache_path(self, plan_name: str, **kwargs) -> Path:
"""Get path to download plan cache file."""
# Create a hash from kwargs for unique cache files
import hashlib
if kwargs:
kwargs_str = str(sorted(kwargs.items()))
hash_suffix = hashlib.md5(kwargs_str.encode()).hexdigest()[:8]
plan_name = f"{plan_name}_{hash_suffix}"
return self.get_path(f"plan_latest_per_channel_{plan_name}.json")
def get_unmatched_report_path(self, timestamp: Optional[str] = None) -> Path:
"""Get path to unmatched songs report file."""
if timestamp:
return self.get_path(f"unmatched_songs_report_{timestamp}.json")
return self.get_path("unmatched_songs_report.json")
def ensure_data_dir_exists(self) -> None:
"""Ensure the data directory exists."""
self.data_dir.mkdir(parents=True, exist_ok=True)
def list_data_files(self) -> list:
"""List all files in the data directory."""
if not self.data_dir.exists():
return []
files = []
for file_path in self.data_dir.iterdir():
if file_path.is_file():
files.append(file_path.name)
return sorted(files)
def file_exists(self, filename: str) -> bool:
"""Check if a file exists in the data directory."""
return self.get_path(filename).exists()
# Global data path manager instance
_data_path_manager: Optional[DataPathManager] = None
def get_data_path_manager(data_dir: Optional[str] = None) -> DataPathManager:
"""
Get the global data path manager instance.
Args:
data_dir: Optional custom data directory path
Returns:
DataPathManager instance
"""
global _data_path_manager
if _data_path_manager is None or data_dir is not None:
_data_path_manager = DataPathManager(data_dir)
return _data_path_manager
def get_data_path(filename: str, data_dir: Optional[str] = None) -> Path:
"""
Get the full path to a file in the data directory.
Args:
filename: Name of the file
data_dir: Optional custom data directory path
Returns:
Full path to the file
"""
return get_data_path_manager(data_dir).get_path(filename)
def get_data_dir(data_dir: Optional[str] = None) -> Path:
"""
Get the configured data directory path.
Args:
data_dir: Optional custom data directory path
Returns:
Path to the data directory
"""
return get_data_path_manager(data_dir).data_dir
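`get_download_plan_cache_path` above derives cache file names from sorted keyword arguments, so the same option set always maps to the same file. A self-contained sketch of that naming scheme (the helper name is illustrative):

```python
import hashlib
from pathlib import Path

def plan_cache_name(data_dir: str, plan_name: str, **kwargs) -> Path:
    # Same scheme as get_download_plan_cache_path: kwargs are sorted and
    # stringified before hashing, so argument order does not matter.
    if kwargs:
        suffix = hashlib.md5(str(sorted(kwargs.items())).encode()).hexdigest()[:8]
        plan_name = f"{plan_name}_{suffix}"
    return Path(data_dir) / f"plan_latest_per_channel_{plan_name}.json"
```

Because kwargs are sorted, `fuzzy=True, limit=5` and `limit=5, fuzzy=True` resolve to the same cache file.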

View File

@@ -20,6 +20,12 @@ from karaoke_downloader.youtube_utils import (
execute_yt_dlp_command,
show_available_formats,
)
from karaoke_downloader.file_utils import (
cleanup_temp_files,
get_unique_filename,
is_valid_mp4_file,
sanitize_filename,
)
class DownloadPipeline:
@@ -63,9 +69,15 @@
True if successful, False otherwise
"""
try:
# Step 1: Prepare file path
filename = sanitize_filename(artist, title)
output_path = self.downloads_dir / channel_name / filename
# Step 1: Prepare file path and check for existing files
output_path, file_exists = get_unique_filename(self.downloads_dir, channel_name, artist, title)
if file_exists:
print(f"⏭️ Skipping download - file already exists: {output_path.name}")
# Still add tags and track the existing file
if self._add_tags(output_path, artist, title, channel_name):
self._track_download(output_path, artist, title, video_id, channel_name)
return True
# Step 2: Download video
if not self._download_video(video_id, output_path, artist, title, channel_name):
@@ -214,8 +226,10 @@
) -> bool:
"""Step 3: Add ID3 tags to the downloaded file."""
try:
# Use the same artist/title as the filename for consistency
# Don't add "(Karaoke Version)" to the ID3 tag title
add_id3_tags(
output_path, f"{artist} - {title} (Karaoke Version)", channel_name
output_path, f"{artist} - {title}", channel_name
)
print(f"🏷️ Added ID3 tags: {artist} - {title}")
return True
@@ -283,9 +297,10 @@
video_title = video.get("title", "")
# Extract artist and title from video title
from karaoke_downloader.id3_utils import extract_artist_title
from karaoke_downloader.channel_parser import ChannelParser
artist, title = extract_artist_title(video_title)
channel_parser = ChannelParser()
artist, title = channel_parser.extract_artist_title(video_title, channel_name)
print(f" ({i}/{total}) Processing: {artist} - {title}")

View File

@@ -3,19 +3,31 @@ Download plan building utilities.
Handles pre-scanning channels and building download plans.
"""
import concurrent.futures
import hashlib
import json
import sys
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from karaoke_downloader.cache_manager import (
delete_plan_cache,
get_download_plan_cache_file,
load_cached_plan,
save_plan_cache,
)
# Import all fuzzy matching functions
from karaoke_downloader.fuzzy_matcher import (
create_song_key,
extract_artist_title,
create_video_key,
get_similarity_function,
is_exact_match,
is_fuzzy_match,
normalize_title,
)
from karaoke_downloader.channel_parser import ChannelParser
from karaoke_downloader.data_path_manager import get_data_path_manager
from karaoke_downloader.youtube_utils import get_channel_info
# Constants
@@ -23,6 +35,156 @@ DEFAULT_FILENAME_LENGTH_LIMIT = 100
DEFAULT_ARTIST_LENGTH_LIMIT = 30
DEFAULT_TITLE_LENGTH_LIMIT = 60
DEFAULT_FUZZY_THRESHOLD = 85
DEFAULT_DISPLAY_LIMIT = 10
def generate_unmatched_report(unmatched: List[Dict[str, Any]], report_path: Optional[str] = None) -> str:
"""
Generate a detailed report of unmatched songs and save it to a file.
Args:
unmatched: List of unmatched songs from build_download_plan
report_path: Optional path to save the report (default: a timestamped unmatched_songs_report_&lt;timestamp&gt;.json in the data directory)
Returns:
Path to the saved report file
"""
if report_path is None:
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
report_path = str(get_data_path_manager().get_unmatched_report_path(timestamp))
report_data = {
"generated_at": datetime.now().isoformat(),
"total_unmatched": len(unmatched),
"unmatched_songs": []
}
for song in unmatched:
report_data["unmatched_songs"].append({
"artist": song["artist"],
"title": song["title"],
"position": song.get("position", 0),
"search_key": create_song_key(song["artist"], song["title"])
})
# Sort by artist, then by title for easier reading
report_data["unmatched_songs"].sort(key=lambda x: (x["artist"].lower(), x["title"].lower()))
# Ensure the data directory exists
report_file = Path(report_path)
report_file.parent.mkdir(parents=True, exist_ok=True)
# Save the report
with open(report_file, 'w', encoding='utf-8') as f:
json.dump(report_data, f, indent=2, ensure_ascii=False)
return str(report_file)
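The report above orders entries case-insensitively by artist, then title. The ordering rule in isolation, with illustrative sample data:

```python
songs = [
    {"artist": "ZZ Top", "title": "Legs"},
    {"artist": "abba", "title": "Waterloo"},
    {"artist": "ABBA", "title": "SOS"},
]
# Same key as generate_unmatched_report: case-insensitive artist, then
# title, so "abba" and "ABBA" sort together.
songs.sort(key=lambda s: (s["artist"].lower(), s["title"].lower()))
print([f"{s['artist']} - {s['title']}" for s in songs])
# → ['ABBA - SOS', 'abba - Waterloo', 'ZZ Top - Legs']
```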
def _scan_channel_for_matches(
channel_url,
channel_name,
channel_id,
song_keys,
song_lookup,
fuzzy_match,
fuzzy_threshold,
show_pagination,
yt_dlp_path,
tracker,
):
"""
Scan a single channel for matches (used in parallel processing).
Args:
channel_url: URL of the channel to scan
channel_name: Name of the channel
channel_id: ID of the channel
song_keys: Set of song keys to match against
song_lookup: Dictionary mapping song keys to song data
fuzzy_match: Whether to use fuzzy matching
fuzzy_threshold: Threshold for fuzzy matching
show_pagination: Whether to show pagination progress
yt_dlp_path: Path to yt-dlp executable
tracker: Tracking manager instance
Returns:
List of video matches found in this channel
"""
print(f"\n🚦 Scanning channel: {channel_name} ({channel_url})")
# Get channel info if not provided
if not channel_name or not channel_id:
channel_name, channel_id = get_channel_info(channel_url)
# Fetch video list from channel
available_videos = tracker.get_channel_video_list(
channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False, show_pagination=show_pagination
)
print(f" 📊 Channel has {len(available_videos)} videos to scan")
video_matches = []
# Pre-process video titles for efficient matching
channel_parser = ChannelParser()
if fuzzy_match:
# For fuzzy matching, create normalized video keys
for video in available_videos:
v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
video_key = create_song_key(v_artist, v_title)
# Find best match among remaining songs
best_match = None
best_score = 0
for song_key in song_keys:
if song_key in song_lookup: # Only check unmatched songs
score = get_similarity_function()(song_key, video_key)
if score >= fuzzy_threshold and score > best_score:
best_score = score
best_match = song_key
if best_match:
song = song_lookup[best_match]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": best_score,
}
)
# Remove matched song from future consideration
del song_lookup[best_match]
song_keys.remove(best_match)
else:
# For exact matching, use direct key comparison
for video in available_videos:
v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
video_key = create_song_key(v_artist, v_title)
if video_key in song_keys:
song = song_lookup[video_key]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": 100,
}
)
# Remove matched song from future consideration
del song_lookup[video_key]
song_keys.remove(video_key)
print(f" ✅ Found {len(video_matches)} matches in {channel_name}")
return video_matches
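In the exact-match branch above, each songlist entry can be claimed by at most one video: on a hit the key is removed from `song_lookup`/`song_keys`. A self-contained sketch of that claim-once pass; `song_key` is a simplified stand-in for `fuzzy_matcher.create_song_key`, and the videos here carry pre-split artist/title fields (the real code extracts them from the video title via `ChannelParser`):

```python
def song_key(artist: str, title: str) -> str:
    # Simplified stand-in for create_song_key; the real normalization
    # lives in karaoke_downloader.fuzzy_matcher.
    return f"{artist.strip().lower()}|{title.strip().lower()}"

def match_videos(videos, songs):
    lookup = {song_key(s["artist"], s["title"]): s for s in songs}
    matches = []
    for video in videos:
        key = song_key(video["artist"], video["title"])
        if key in lookup:
            # Claim the song and drop it so later videos cannot re-match it.
            matches.append({**lookup.pop(key), "video_id": video["id"]})
    return matches
```

The fuzzy branch replaces the dict hit with a best-score scan over the remaining keys, but keeps the same claim-once removal.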
def build_download_plan(
@@ -32,6 +194,9 @@
yt_dlp_path,
fuzzy_match=False,
fuzzy_threshold=DEFAULT_FUZZY_THRESHOLD,
show_pagination=False,
parallel_channels=False,
max_channel_workers=3,
):
"""
For each song in undownloaded, scan all channels for a match.
@@ -52,85 +217,200 @@
song_keys.add(key)
song_lookup[key] = song
for i, channel_url in enumerate(channel_urls, 1):
print(f"\n🚦 Starting channel {i}/{len(channel_urls)}: {channel_url}")
print(f" 🔍 Getting channel info...")
channel_name, channel_id = get_channel_info(channel_url)
print(f" ✅ Channel info: {channel_name} (ID: {channel_id})")
print(f" 🔍 Fetching video list from channel...")
available_videos = tracker.get_channel_video_list(
channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False
)
print(
f" 📊 Channel has {len(available_videos)} videos to scan against {len(undownloaded)} songlist songs"
)
matches_this_channel = 0
video_matches = [] # Initialize video_matches for this channel
if parallel_channels:
print(f"🚀 Running parallel channel scanning with {max_channel_workers} workers.")
# Create a thread-safe copy of song data for parallel processing
import threading
song_keys_lock = threading.Lock()
song_lookup_lock = threading.Lock()
def scan_channel_safe(channel_url):
"""Thread-safe channel scanning function."""
print(f"\n🚦 Scanning channel: {channel_url}")
# Get channel info
channel_name, channel_id = get_channel_info(channel_url)
print(f" ✅ Channel info: {channel_name} (ID: {channel_id})")
# Fetch video list from channel
available_videos = tracker.get_channel_video_list(
channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False, show_pagination=show_pagination
)
print(f" 📊 Channel has {len(available_videos)} videos to scan")
video_matches = []
# Pre-process video titles for efficient matching
channel_parser = ChannelParser()
if fuzzy_match:
# For fuzzy matching, create normalized video keys
for video in available_videos:
v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
video_key = create_song_key(v_artist, v_title)
# Pre-process video titles for efficient matching
if fuzzy_match:
# For fuzzy matching, create normalized video keys
for video in available_videos:
v_artist, v_title = extract_artist_title(video["title"])
video_key = create_song_key(v_artist, v_title)
# Find best match among remaining songs (thread-safe)
best_match = None
best_score = 0
with song_keys_lock:
available_song_keys = list(song_keys) # Copy for iteration
for song_key in available_song_keys:
with song_lookup_lock:
if song_key in song_lookup: # Only check unmatched songs
score = get_similarity_function()(song_key, video_key)
if score >= fuzzy_threshold and score > best_score:
best_score = score
best_match = song_key
# Find best match among remaining songs
best_match = None
best_score = 0
for song_key in song_keys:
if song_key in song_lookup: # Only check unmatched songs
score = get_similarity_function()(song_key, video_key)
if score >= fuzzy_threshold and score > best_score:
best_score = score
best_match = song_key
if best_match:
with song_lookup_lock:
if best_match in song_lookup: # Double-check it's still available
song = song_lookup[best_match]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": best_score,
}
)
# Remove matched song from future consideration
del song_lookup[best_match]
with song_keys_lock:
song_keys.discard(best_match)
else:
# For exact matching, use direct key comparison
for video in available_videos:
v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
video_key = create_song_key(v_artist, v_title)
if best_match:
song = song_lookup[best_match]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": best_score,
}
)
# Remove matched song from future consideration
del song_lookup[best_match]
song_keys.remove(best_match)
matches_this_channel += 1
else:
# For exact matching, use direct key comparison
for video in available_videos:
v_artist, v_title = extract_artist_title(video["title"])
video_key = create_song_key(v_artist, v_title)
with song_lookup_lock:
if video_key in song_keys and video_key in song_lookup:
song = song_lookup[video_key]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": 100,
}
)
# Remove matched song from future consideration
del song_lookup[video_key]
with song_keys_lock:
song_keys.discard(video_key)
print(f" ✅ Found {len(video_matches)} matches in {channel_name}")
return video_matches
# Execute parallel channel scanning
with concurrent.futures.ThreadPoolExecutor(max_workers=max_channel_workers) as executor:
# Submit all channel scanning tasks
future_to_channel = {
executor.submit(scan_channel_safe, channel_url): channel_url
for channel_url in channel_urls
}
# Process results as they complete
for future in concurrent.futures.as_completed(future_to_channel):
channel_url = future_to_channel[future]
try:
video_matches = future.result()
plan.extend(video_matches)
channel_name, _ = get_channel_info(channel_url)
channel_match_counts[channel_name] = len(video_matches)
except Exception as e:
print(f"⚠️ Error processing channel {channel_url}: {e}")
channel_name, _ = get_channel_info(channel_url)
channel_match_counts[channel_name] = 0
else:
for i, channel_url in enumerate(channel_urls, 1):
print(f"\n🚦 Starting channel {i}/{len(channel_urls)}: {channel_url}")
print(f" 🔍 Getting channel info...")
channel_name, channel_id = get_channel_info(channel_url)
print(f" ✅ Channel info: {channel_name} (ID: {channel_id})")
print(f" 🔍 Fetching video list from channel...")
available_videos = tracker.get_channel_video_list(
channel_url, yt_dlp_path=str(yt_dlp_path), force_refresh=False, show_pagination=show_pagination
)
print(
f" 📊 Channel has {len(available_videos)} videos to scan against {len(undownloaded)} songlist songs"
)
matches_this_channel = 0
video_matches = [] # Initialize video_matches for this channel
if video_key in song_keys:
song = song_lookup[video_key]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": 100,
}
)
# Remove matched song from future consideration
del song_lookup[video_key]
song_keys.remove(video_key)
matches_this_channel += 1
# Pre-process video titles for efficient matching
channel_parser = ChannelParser()
if fuzzy_match:
# For fuzzy matching, create normalized video keys
for video in available_videos:
v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
video_key = create_song_key(v_artist, v_title)
# Add matches to plan
plan.extend(video_matches)
# Find best match among remaining songs
best_match = None
best_score = 0
for song_key in song_keys:
if song_key in song_lookup: # Only check unmatched songs
score = get_similarity_function()(song_key, video_key)
if score >= fuzzy_threshold and score > best_score:
best_score = score
best_match = song_key
# Print match count once per channel
channel_match_counts[channel_name] = matches_this_channel
print(f" → Found {matches_this_channel} songlist matches in this channel.")
if best_match:
song = song_lookup[best_match]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": best_score,
}
)
# Remove matched song from future consideration
del song_lookup[best_match]
song_keys.remove(best_match)
matches_this_channel += 1
else:
# For exact matching, use direct key comparison
for video in available_videos:
v_artist, v_title = channel_parser.extract_artist_title(video["title"], channel_name)
video_key = create_song_key(v_artist, v_title)
if video_key in song_keys:
song = song_lookup[video_key]
video_matches.append(
{
"artist": song["artist"],
"title": song["title"],
"channel_name": channel_name,
"channel_url": channel_url,
"video_id": video["id"],
"video_title": video["title"],
"match_score": 100,
}
)
# Remove matched song from future consideration
del song_lookup[video_key]
song_keys.remove(video_key)
matches_this_channel += 1
# Add matches to plan
plan.extend(video_matches)
# Print match count once per channel
channel_match_counts[channel_name] = matches_this_channel
print(f" → Found {matches_this_channel} songlist matches in this channel.")
# Remaining unmatched songs
unmatched = list(song_lookup.values())
@@ -143,4 +423,13 @@ def build_download_plan(
f" TOTAL: {sum(channel_match_counts.values())} matches across {len(channel_match_counts)} channels."
)
# Generate unmatched songs report if there are any
if unmatched:
try:
report_file = generate_unmatched_report(unmatched)
print(f"\n📋 Unmatched songs report saved to: {report_file}")
print(f"📋 Total unmatched songs: {len(unmatched)}")
except Exception as e:
print(f"⚠️ Could not generate unmatched songs report: {e}")
return plan, unmatched
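The fuzzy branch of `build_download_plan` keeps the highest-scoring still-unmatched song at or above `fuzzy_threshold`. The diff does not show `get_similarity_function()`, so this sketch substitutes `difflib.SequenceMatcher` scaled to the 0–100 range the surrounding code assumes:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    # Stand-in for get_similarity_function(); returns a 0-100 score.
    return SequenceMatcher(None, a, b).ratio() * 100

def best_fuzzy_match(video_key, song_keys, threshold=85):
    """Return (best_key, best_score), or (None, 0) if nothing clears the threshold."""
    best_key, best_score = None, 0
    for song_key in song_keys:
        score = similarity(song_key, video_key)
        if score >= threshold and score > best_score:
            best_key, best_score = song_key, score
    return best_key, best_score
```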

File diff suppressed because it is too large.


@@ -34,7 +34,6 @@ def sanitize_filename(
# Clean up title
safe_title = (
title.replace("(From ", "")
.replace(")", "")
.replace(" - ", " ")
.replace(":", "")
)
@@ -54,12 +53,19 @@
)
safe_artist = safe_artist.strip()
# Create filename
filename = f"{safe_artist} - {safe_title}.mp4"
# Create filename - handle empty artist case
if not safe_artist or safe_artist.strip() == "":
# If no artist, just use the title
filename = f"{safe_title}.mp4"
else:
filename = f"{safe_artist} - {safe_title}.mp4"
# Limit filename length if needed
if len(filename) > max_length:
filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
if not safe_artist or safe_artist.strip() == "":
filename = f"{safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
else:
filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
return filename
@@ -81,11 +87,19 @@ def generate_possible_filenames(
safe_title = sanitize_title_for_filenames(title)
safe_artist = artist.replace("'", "").replace('"', "").strip()
return [
f"{safe_artist} - {safe_title}.mp4", # Songlist mode
f"{channel_name} - {safe_title}.mp4", # Latest-per-channel mode
f"{safe_artist} - {safe_title} (Karaoke Version).mp4", # Channel videos mode
]
# Handle empty artist case
if not safe_artist or safe_artist.strip() == "":
return [
f"{safe_title}.mp4", # Songlist mode (no artist)
f"{channel_name} - {safe_title}.mp4", # Latest-per-channel mode
f"{safe_title} (Karaoke Version).mp4", # Channel videos mode (no artist)
]
else:
return [
f"{safe_artist} - {safe_title}.mp4", # Songlist mode
f"{channel_name} - {safe_title}.mp4", # Latest-per-channel mode
f"{safe_artist} - {safe_title} (Karaoke Version).mp4", # Channel videos mode
]
def sanitize_title_for_filenames(title: str) -> str:
@@ -112,6 +126,7 @@ def check_file_exists_with_patterns(
) -> Tuple[bool, Optional[Path]]:
"""
Check if a file exists using multiple possible filename patterns.
Also checks for files with (2), (3), etc. suffixes that yt-dlp might create.
Args:
downloads_dir: Base downloads directory
@@ -130,15 +145,56 @@
# Apply length limits if needed
safe_artist = artist.replace("'", "").replace('"', "").strip()
safe_title = sanitize_title_for_filenames(title)
filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
if not safe_artist or safe_artist.strip() == "":
filename = f"{safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
else:
filename = f"{safe_artist[:DEFAULT_ARTIST_LENGTH_LIMIT]} - {safe_title[:DEFAULT_TITLE_LENGTH_LIMIT]}.mp4"
# Check for exact filename match
file_path = channel_dir / filename
if file_path.exists() and file_path.stat().st_size > 0:
return True, file_path
# Check for files with (2), (3), etc. suffixes
base_name = filename.replace(".mp4", "")
for suffix in range(2, 10): # Check up to (9)
suffixed_filename = f"{base_name} ({suffix}).mp4"
suffixed_path = channel_dir / suffixed_filename
if suffixed_path.exists() and suffixed_path.stat().st_size > 0:
return True, suffixed_path
return False, None
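`check_file_exists_with_patterns` probes the exact filename first and then the ` (2)` through ` (9)` variants that yt-dlp may create. The candidate names it checks can be sketched as a small generator (hypothetical helper name):

```python
def candidate_filenames(base_filename: str, max_suffix: int = 9):
    """Yield the exact name first, then the ' (2)' .. ' (9)' duplicate variants."""
    stem = base_filename[:-4] if base_filename.endswith(".mp4") else base_filename
    yield f"{stem}.mp4"
    for n in range(2, max_suffix + 1):
        yield f"{stem} ({n}).mp4"
```

Each candidate would then be joined to the channel directory and checked for existence and non-zero size, as in the function above.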
def get_unique_filename(
downloads_dir: Path, channel_name: str, artist: str, title: str
) -> Tuple[Path, bool]:
"""
Get a unique filename for download, checking for existing files including duplicates.
Args:
downloads_dir: Base downloads directory
channel_name: Channel name
artist: Song artist
title: Song title
Returns:
Tuple of (file_path, is_existing) where is_existing indicates if a file already exists
"""
filename = sanitize_filename(artist, title)
channel_dir = downloads_dir / channel_name
file_path = channel_dir / filename
# Check if file already exists
exists, existing_path = check_file_exists_with_patterns(downloads_dir, channel_name, artist, title)
if exists and existing_path:
print(f"📁 File already exists: {existing_path.name}")
return existing_path, True
return file_path, False
def ensure_directory_exists(directory: Path) -> None:
"""
Ensure a directory exists, creating it if necessary.


@@ -32,10 +32,72 @@ def normalize_title(title):
def extract_artist_title(video_title):
"""Extract artist and title from video title."""
"""
Extract artist and title from video title.
This function handles multiple common video title formats found on YouTube karaoke channels:
1. "Artist - Title" format: "38 Special - Hold On Loosely"
2. "Title Karaoke | Artist Karaoke Version" format: "Hold On Loosely Karaoke | 38 Special Karaoke Version"
3. "Title Artist KARAOKE" format: "Hold On Loosely 38 Special KARAOKE"
Args:
video_title (str): The YouTube video title to parse
Returns:
tuple: (artist, title) where artist and title are strings. If parsing fails,
artist will be empty string and title will be the full video title.
Examples:
>>> extract_artist_title("38 Special - Hold On Loosely")
("38 Special", "Hold On Loosely")
>>> extract_artist_title("Hold On Loosely Karaoke | 38 Special Karaoke Version")
("38 Special", "Hold On Loosely")
>>> extract_artist_title("Unknown Format Video Title")
("", "Unknown Format Video Title")
"""
# Handle "Artist - Title" format
if " - " in video_title:
parts = video_title.split(" - ", 1)
return parts[0].strip(), parts[1].strip()
# Handle "Title Karaoke | Artist Karaoke Version" format
if " | " in video_title and "karaoke" in video_title.lower():
parts = video_title.split(" | ", 1)
title_part = parts[0].strip()
artist_part = parts[1].strip()
# Clean up the parts
title = title_part.replace("Karaoke", "").strip()
artist = artist_part.replace("Karaoke Version", "").strip()
return artist, title
# Handle "Title Artist KARAOKE" format
if "karaoke" in video_title.lower():
# Look for patterns like "Title Artist KARAOKE"
# Simplified heuristic: treat the word immediately before "KARAOKE"
# as the artist and everything before that word as the title
words = video_title.split()
if len(words) >= 3:
for i, word in enumerate(words):
if "karaoke" in word.lower():
if i >= 2:
title = " ".join(words[:i-1])
artist = words[i-1]  # exclude the "KARAOKE" token itself
return artist, title
# If we can't parse it, return empty artist and full title
return "", video_title
# Default: return empty artist and full title
return "", video_title


@@ -7,17 +7,33 @@ except ImportError:
MUTAGEN_AVAILABLE = False
def extract_artist_title(video_title):
title = (
video_title.replace("(Karaoke Version)", "").replace("(Karaoke)", "").strip()
)
if " - " in title:
parts = title.split(" - ", 1)
if len(parts) == 2:
artist = parts[0].strip()
song_title = parts[1].strip()
return artist, song_title
return "Unknown Artist", title
def clean_channel_name(channel_name: str) -> str:
"""
Clean channel name for ID3 tagging by removing @ symbol and ensuring it's alpha-only.
Args:
channel_name: Raw channel name (may contain @ symbol)
Returns:
Cleaned channel name suitable for ID3 tags
"""
# Remove @ symbol if present
if channel_name.startswith('@'):
channel_name = channel_name[1:]
# Remove any non-alphanumeric characters and convert to single word
# Keep only letters, numbers, and spaces, then take the first word
cleaned = re.sub(r'[^a-zA-Z0-9\s]', '', channel_name)
words = cleaned.split()
if words:
return words[0] # Return only the first word
return "Unknown"
# Import the enhanced extract_artist_title function from fuzzy_matcher.py
# This ensures consistent parsing across all modules and supports multiple video title formats
from karaoke_downloader.fuzzy_matcher import extract_artist_title
def add_id3_tags(file_path, video_title, channel_name):
@@ -26,12 +42,13 @@ def add_id3_tags(file_path, video_title, channel_name):
return
try:
artist, title = extract_artist_title(video_title)
clean_channel = clean_channel_name(channel_name)
mp4 = MP4(str(file_path))
mp4["\xa9nam"] = title
mp4["\xa9ART"] = artist
mp4["\xa9alb"] = f"{channel_name} Karaoke"
mp4["\xa9alb"] = clean_channel # Use clean channel name only, no suffix
mp4["\xa9gen"] = "Karaoke"
mp4.save()
print(f"📝 Added ID3 tags: Artist='{artist}', Title='{title}'")
print(f"📝 Added ID3 tags: Artist='{artist}', Title='{title}', Album='{clean_channel}'")
except Exception as e:
print(f"⚠️ Could not add ID3 tags: {e}")
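`clean_channel_name` above reduces a channel handle to a single alphanumeric word for the album tag. A standalone sketch of the same steps:

```python
import re

def clean_channel(channel_name: str) -> str:
    """Strip a leading '@', drop punctuation, and keep only the first word."""
    if channel_name.startswith("@"):
        channel_name = channel_name[1:]
    # Keep only letters, digits, and spaces, then take the first word
    cleaned = re.sub(r"[^a-zA-Z0-9\s]", "", channel_name)
    words = cleaned.split()
    return words[0] if words else "Unknown"
```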


@@ -0,0 +1,83 @@
"""
Manual video manager for handling static video collections.
"""
import json
from pathlib import Path
from typing import Dict, List, Optional, Any
from karaoke_downloader.data_path_manager import get_data_path_manager
def load_manual_videos(manual_file: Optional[str] = None) -> List[Dict[str, Any]]:
"""
Load manual videos from the JSON file.
Args:
manual_file: Path to manual videos JSON file
Returns:
List of video dictionaries
"""
if manual_file is None:
manual_file = str(get_data_path_manager().get_manual_videos_path())
manual_path = Path(manual_file)
if not manual_path.exists():
print(f"⚠️ Manual videos file not found: {manual_file}")
return []
try:
with open(manual_path, 'r', encoding='utf-8') as f:
data = json.load(f)
videos = data.get("videos", [])
print(f"📋 Loaded {len(videos)} manual videos from {manual_file}")
return videos
except Exception as e:
print(f"❌ Error loading manual videos: {e}")
return []
def get_manual_videos_for_channel(channel_name: str, manual_file: Optional[str] = None) -> List[Dict[str, Any]]:
"""
Get manual videos for a specific channel.
Args:
channel_name: Channel name (should be "@ManualVideos")
manual_file: Path to manual videos JSON file
Returns:
List of video dictionaries
"""
if manual_file is None:
manual_file = str(get_data_path_manager().get_manual_videos_path())
if channel_name != "@ManualVideos":
return []
return load_manual_videos(manual_file)
def is_manual_channel(channel_url: str) -> bool:
"""
Check if a channel URL is a manual channel.
Args:
channel_url: Channel URL
Returns:
True if it's a manual channel
"""
return channel_url == "manual://static"
def get_manual_channel_info(channel_url: str) -> tuple[str, str]:
"""
Get channel info for manual channels.
Args:
channel_url: Channel URL
Returns:
Tuple of (channel_name, channel_id)
"""
if channel_url == "manual://static":
return "@ManualVideos", "manual"
return None, None


@@ -7,28 +7,40 @@ import json
from datetime import datetime
from pathlib import Path
from karaoke_downloader.data_path_manager import get_data_path_manager
def load_server_songs(songs_path="data/songs.json"):
"""Load the list of songs already available on the server."""
def load_server_songs(songs_path=None):
"""Load the list of songs already available on the server with format information."""
if songs_path is None:
songs_path = str(get_data_path_manager().get_songs_path())
songs_file = Path(songs_path)
if not songs_file.exists():
print(f"⚠️ Server songs file not found: {songs_path}")
return set()
return {}
try:
with open(songs_file, "r", encoding="utf-8") as f:
data = json.load(f)
server_songs = set()
server_songs = {}
for song in data:
if "artist" in song and "title" in song:
if "artist" in song and "title" in song and "path" in song:
artist = song["artist"].strip()
title = song["title"].strip()
path = song["path"].strip()
key = f"{artist.lower()}_{normalize_title(title)}"
server_songs.add(key)
server_songs[key] = {
"artist": artist,
"title": title,
"path": path,
"is_mp3": path.lower().endswith('.mp3'),
"is_cdg": 'cdg' in path.lower(),
"is_mp4": path.lower().endswith('.mp4')
}
print(f"📋 Loaded {len(server_songs)} songs from server (songs.json)")
return server_songs
except (json.JSONDecodeError, FileNotFoundError) as e:
print(f"⚠️ Could not load server songs: {e}")
return set()
return {}
def is_song_on_server(server_songs, artist, title):
@@ -37,9 +49,24 @@ def is_song_on_server(server_songs, artist, title):
return key in server_songs
def should_skip_server_song(server_songs, artist, title):
"""Check if a song should be skipped because it's already available as MP4 on server.
Returns True if the song should be skipped (MP4 format), False if it should be downloaded (MP3/CDG format)."""
key = f"{artist.lower()}_{normalize_title(title)}"
if key not in server_songs:
return False # Not on server, so don't skip
song_info = server_songs[key]
# Skip if it's an MP4 file (video format)
# Don't skip if it's MP3 or in CDG folder (different format)
return song_info.get("is_mp4", False) and not song_info.get("is_cdg", False)
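`should_skip_server_song` only skips songs whose server copy is already an MP4 video; MP3 or CDG copies are still worth downloading as video. A self-contained sketch, with `normalize` standing in for the real `normalize_title` and `classify` mirroring the path-based flags built in `load_server_songs`:

```python
def normalize(title: str) -> str:
    # Stand-in for normalize_title(): collapse whitespace, lowercase
    return " ".join(title.split()).lower()

def classify(path: str) -> dict:
    # Same format flags load_server_songs derives from the "path" field
    p = path.lower()
    return {"is_mp3": p.endswith(".mp3"), "is_cdg": "cdg" in p, "is_mp4": p.endswith(".mp4")}

def should_skip(server_songs: dict, artist: str, title: str) -> bool:
    """Skip only if the server copy is an MP4 video (not an MP3/CDG audio version)."""
    info = server_songs.get(f"{artist.lower()}_{normalize(title)}")
    if info is None:
        return False  # not on the server at all
    return info["is_mp4"] and not info["is_cdg"]
```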
def load_server_duplicates_tracking(
tracking_path="data/server_duplicates_tracking.json",
tracking_path=None,
):
"""Load the tracking of songs found to be duplicates on the server."""
if tracking_path is None:
tracking_path = str(get_data_path_manager().get_server_duplicates_tracking_path())
tracking_file = Path(tracking_path)
if not tracking_file.exists():
@@ -53,8 +80,10 @@
def save_server_duplicates_tracking(
tracking, tracking_path="data/server_duplicates_tracking.json"
tracking, tracking_path=None
):
"""Save the tracking of songs found to be duplicates on the server."""
if tracking_path is None:
tracking_path = str(get_data_path_manager().get_server_duplicates_tracking_path())
try:
with open(tracking_path, "w", encoding="utf-8") as f:
@@ -86,8 +115,9 @@ def mark_song_as_server_duplicate(tracking, artist, title, video_title, channel_
def check_and_mark_server_duplicate(
server_songs, server_duplicates_tracking, artist, title, video_title, channel_name
):
"""Check if a song is on server and mark it as duplicate if so. Returns True if it's a duplicate."""
if is_song_on_server(server_songs, artist, title):
"""Check if a song should be skipped because it's already available as MP4 on server and mark it as duplicate if so.
Returns True if it should be skipped (MP4 format), False if it should be downloaded (MP3/CDG format)."""
if should_skip_server_song(server_songs, artist, title):
if not is_song_marked_as_server_duplicate(
server_duplicates_tracking, artist, title
):


@@ -35,6 +35,7 @@ class SongValidator:
video_title: Optional[str] = None,
server_songs: Optional[Dict[str, Any]] = None,
server_duplicates_tracking: Optional[Dict[str, Any]] = None,
force_download: bool = False,
) -> Tuple[bool, Optional[str], int]:
"""
Check if a song should be skipped based on multiple criteria.
@@ -53,10 +54,15 @@
video_title: YouTube video title (optional)
server_songs: Server songs data (optional)
server_duplicates_tracking: Server duplicates tracking (optional)
force_download: If True, bypass all validation checks and force download
Returns:
Tuple of (should_skip, reason, total_filtered)
"""
# If force download is enabled, skip all validation checks
if force_download:
return False, None, 0
total_filtered = 0
# Check 1: Already downloaded by this system


@@ -0,0 +1,265 @@
import json
import os
from pathlib import Path
from typing import List, Dict, Any, Optional
from mutagen.mp4 import MP4
from karaoke_downloader.data_path_manager import get_data_path_manager
class SongListGenerator:
"""Utility class for generating song lists from MP4 files with ID3 tags."""
def __init__(self, songlist_path: Optional[str] = None):
if songlist_path is None:
songlist_path = str(get_data_path_manager().get_songlist_path())
self.songlist_path = Path(songlist_path)
self.songlist_path.parent.mkdir(parents=True, exist_ok=True)
def read_existing_songlist(self) -> List[Dict[str, Any]]:
"""Read existing song list from JSON file."""
if self.songlist_path.exists():
try:
with open(self.songlist_path, 'r', encoding='utf-8') as f:
return json.load(f)
except (json.JSONDecodeError, IOError) as e:
print(f"⚠️ Warning: Could not read existing songlist: {e}")
return []
return []
def save_songlist(self, songlist: List[Dict[str, Any]]) -> None:
"""Save song list to JSON file."""
try:
with open(self.songlist_path, 'w', encoding='utf-8') as f:
json.dump(songlist, f, indent=2, ensure_ascii=False)
print(f"✅ Song list saved to {self.songlist_path}")
except IOError as e:
print(f"❌ Error saving song list: {e}")
raise
def extract_id3_tags(self, mp4_path: Path) -> Optional[Dict[str, str]]:
"""Extract ID3 tags from MP4 file."""
try:
mp4 = MP4(str(mp4_path))
# Extract artist and title from ID3 tags
artist = mp4.get("\xa9ART", ["Unknown Artist"])[0]
title = mp4.get("\xa9nam", ["Unknown Title"])[0]
return {
"artist": artist,
"title": title
}
except Exception as e:
print(f"⚠️ Warning: Could not extract ID3 tags from {mp4_path.name}: {e}")
return None
def scan_directory_for_mp4_files(self, directory_path: str) -> List[Path]:
"""Scan directory for MP4 files."""
directory = Path(directory_path)
if not directory.exists():
raise FileNotFoundError(f"Directory not found: {directory_path}")
if not directory.is_dir():
raise ValueError(f"Path is not a directory: {directory_path}")
mp4_files = list(directory.glob("*.mp4"))
if not mp4_files:
print(f"⚠️ No MP4 files found in {directory_path}")
return []
print(f"📁 Found {len(mp4_files)} MP4 files in {directory.name}")
return sorted(mp4_files)
def generate_songlist_from_directory(self, directory_path: str, append: bool = True) -> Dict[str, Any]:
"""Generate a song list from MP4 files in a directory."""
directory = Path(directory_path)
directory_name = directory.name
# Scan for MP4 files
mp4_files = self.scan_directory_for_mp4_files(directory_path)
if not mp4_files:
return {}
# Extract ID3 tags and create songs list
songs = []
for index, mp4_file in enumerate(mp4_files, start=1):
id3_tags = self.extract_id3_tags(mp4_file)
if id3_tags:
song = {
"position": index,
"title": id3_tags["title"],
"artist": id3_tags["artist"]
}
songs.append(song)
print(f" {index:3d}. {id3_tags['artist']} - {id3_tags['title']}")
if not songs:
print("❌ No valid ID3 tags found in any MP4 files")
return {}
# Create the song list entry
songlist_entry = {
"title": directory_name,
"songs": songs
}
# Handle appending to existing song list
if append:
existing_songlist = self.read_existing_songlist()
# Check if a playlist with this title already exists
existing_index = None
for i, entry in enumerate(existing_songlist):
if entry.get("title") == directory_name:
existing_index = i
break
if existing_index is not None:
# Replace existing entry
print(f"🔄 Replacing existing playlist: {directory_name}")
existing_songlist[existing_index] = songlist_entry
else:
# Add new entry to the beginning of the list
print(f" Adding new playlist: {directory_name}")
existing_songlist.insert(0, songlist_entry)
self.save_songlist(existing_songlist)
else:
# Create new song list with just this entry
print(f"📝 Creating new song list with playlist: {directory_name}")
self.save_songlist([songlist_entry])
return songlist_entry
def generate_songlist_from_multiple_directories(self, directory_paths: List[str], append: bool = True) -> List[Dict[str, Any]]:
"""Generate song lists from multiple directories."""
results = []
errors = []
# Read existing song list once at the beginning
existing_songlist = self.read_existing_songlist() if append else []
for directory_path in directory_paths:
try:
print(f"\n📂 Processing directory: {directory_path}")
directory = Path(directory_path)
directory_name = directory.name
# Scan for MP4 files
mp4_files = self.scan_directory_for_mp4_files(directory_path)
if not mp4_files:
continue
# Extract ID3 tags and create songs list
songs = []
for index, mp4_file in enumerate(mp4_files, start=1):
id3_tags = self.extract_id3_tags(mp4_file)
if id3_tags:
song = {
"position": index,
"title": id3_tags["title"],
"artist": id3_tags["artist"]
}
songs.append(song)
print(f" {index:3d}. {id3_tags['artist']} - {id3_tags['title']}")
if not songs:
print("❌ No valid ID3 tags found in any MP4 files")
continue
# Create the song list entry
songlist_entry = {
"title": directory_name,
"songs": songs
}
# Check if a playlist with this title already exists
existing_index = None
for i, entry in enumerate(existing_songlist):
if entry.get("title") == directory_name:
existing_index = i
break
if existing_index is not None:
# Replace existing entry
print(f"🔄 Replacing existing playlist: {directory_name}")
existing_songlist[existing_index] = songlist_entry
else:
# Add new entry to the beginning of the list
print(f" Adding new playlist: {directory_name}")
existing_songlist.insert(0, songlist_entry)
results.append(songlist_entry)
except Exception as e:
error_msg = f"Error processing {directory_path}: {e}"
print(f"{error_msg}")
errors.append(error_msg)
# Save the final song list
if results:
if append:
# Save the updated existing song list
self.save_songlist(existing_songlist)
else:
# Create new song list with just the results
self.save_songlist(results)
# If there were any errors, raise an exception
if errors:
raise Exception(f"Failed to process {len(errors)} directories: {'; '.join(errors)}")
return results
def main():
"""CLI entry point for song list generation."""
import argparse
import sys
parser = argparse.ArgumentParser(
description="Generate song lists from MP4 files with ID3 tags",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python -m karaoke_downloader.songlist_generator /path/to/mp4/directory
python -m karaoke_downloader.songlist_generator /path/to/dir1 /path/to/dir2 --no-append
python -m karaoke_downloader.songlist_generator /path/to/dir --songlist-path custom_songlist.json
"""
)
parser.add_argument(
"directories",
nargs="+",
help="Directory paths containing MP4 files with ID3 tags"
)
parser.add_argument(
"--no-append",
action="store_true",
help="Create a new song list instead of appending to existing one"
)
parser.add_argument(
"--songlist-path",
default=None,
help="Path to the song list JSON file (default: songList.json in the data directory)"
)
args = parser.parse_args()
try:
generator = SongListGenerator(args.songlist_path)
generator.generate_songlist_from_multiple_directories(
args.directories,
append=not args.no_append
)
print("\n✅ Song list generation completed successfully!")
except Exception as e:
print(f"\n❌ Error: {e}")
sys.exit(1)
if __name__ == "__main__":
main()
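The replace-or-prepend logic that `SongListGenerator` applies per playlist title can be isolated as a small helper (hypothetical name `upsert_playlist`):

```python
def upsert_playlist(songlist: list, entry: dict) -> list:
    """Replace the playlist with the same title in place, else prepend the new one."""
    for i, existing in enumerate(songlist):
        if existing.get("title") == entry["title"]:
            songlist[i] = entry  # replace existing playlist
            return songlist
    songlist.insert(0, entry)  # new playlists go to the front
    return songlist
```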


@@ -7,6 +7,7 @@ import json
from datetime import datetime
from pathlib import Path
from karaoke_downloader.data_path_manager import get_data_path_manager
from karaoke_downloader.server_manager import (
check_and_mark_server_duplicate,
is_song_marked_as_server_duplicate,
@@ -16,7 +17,9 @@ from karaoke_downloader.server_manager import (
)
def load_songlist(songlist_path="data/songList.json"):
def load_songlist(songlist_path=None):
if songlist_path is None:
songlist_path = str(get_data_path_manager().get_songlist_path())
songlist_file = Path(songlist_path)
if not songlist_file.exists():
print(f"⚠️ Songlist file not found: {songlist_path}")
@@ -55,7 +58,9 @@ def normalize_title(title):
return " ".join(normalized.split()).lower()
def load_songlist_tracking(tracking_path="data/songlist_tracking.json"):
def load_songlist_tracking(tracking_path=None):
if tracking_path is None:
tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
tracking_file = Path(tracking_path)
if not tracking_file.exists():
return {}
@@ -67,7 +72,9 @@ def load_songlist_tracking(tracking_path="data/songlist_tracking.json"):
return {}
def save_songlist_tracking(tracking, tracking_path="data/songlist_tracking.json"):
def save_songlist_tracking(tracking, tracking_path=None):
if tracking_path is None:
tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
try:
with open(tracking_path, "w", encoding="utf-8") as f:
json.dump(tracking, f, indent=2, ensure_ascii=False)


@@ -1,10 +1,12 @@ import threading
import threading
from enum import Enum
import json
from datetime import datetime
import os
import re
from datetime import datetime, timedelta
from enum import Enum
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from karaoke_downloader.data_path_manager import get_data_path_manager
class SongStatus(str, Enum):
NOT_DOWNLOADED = "NOT_DOWNLOADED"
@@ -25,46 +27,133 @@ class FormatType(str, Enum):
class TrackingManager:
def __init__(
self,
tracking_file="data/karaoke_tracking.json",
cache_file="data/channel_cache.json",
tracking_file=None,
cache_dir=None,
):
if tracking_file is None:
tracking_file = str(get_data_path_manager().get_karaoke_tracking_path())
if cache_dir is None:
cache_dir = str(get_data_path_manager().get_channel_cache_dir())
self.tracking_file = Path(tracking_file)
self.cache_file = Path(cache_file)
self.data = {"playlists": {}, "songs": {}}
self.cache = {}
self._lock = threading.Lock()
self._load()
self._load_cache()
self.cache_dir = Path(cache_dir)
# Ensure cache directory exists
self.cache_dir.mkdir(parents=True, exist_ok=True)
self.data = self._load()
print(f"📊 Tracking manager initialized with {len(self.data.get('songs', {}))} tracked songs")
def _load(self):
"""Load tracking data from JSON file."""
if self.tracking_file.exists():
try:
with open(self.tracking_file, "r", encoding="utf-8") as f:
self.data = json.load(f)
except Exception:
self.data = {"playlists": {}, "songs": {}}
return json.load(f)
except json.JSONDecodeError:
print("⚠️ Corrupted tracking file, creating new one")
return {"songs": {}, "playlists": {}, "last_updated": datetime.now().isoformat()}
def _save(self):
with self._lock:
with open(self.tracking_file, "w", encoding="utf-8") as f:
json.dump(self.data, f, indent=2, ensure_ascii=False)
"""Save tracking data to JSON file."""
self.data["last_updated"] = datetime.now().isoformat()
self.tracking_file.parent.mkdir(parents=True, exist_ok=True)
with open(self.tracking_file, "w", encoding="utf-8") as f:
json.dump(self.data, f, indent=2, ensure_ascii=False)
def force_save(self):
"""Force save the tracking data."""
self._save()
def _load_cache(self):
if self.cache_file.exists():
try:
with open(self.cache_file, "r", encoding="utf-8") as f:
self.cache = json.load(f)
except Exception:
self.cache = {}
def _get_channel_cache_file(self, channel_id: str) -> Path:
"""Get the cache file path for a specific channel."""
# Sanitize channel ID for filename
safe_channel_id = re.sub(r'[<>:"/\\|?*]', '_', channel_id)
return self.cache_dir / f"{safe_channel_id}.json"
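The sanitization rule above can be checked in isolation; a minimal sketch (the helper name `safe_cache_name` is hypothetical, the regex is the one used in `_get_channel_cache_file`):

```python
import re

def safe_cache_name(channel_id: str) -> str:
    # Replace characters that are invalid in Windows/macOS filenames with '_'.
    return re.sub(r'[<>:"/\\|?*]', '_', channel_id)

print(safe_cache_name('UC/karaoke|channel?'))  # UC_karaoke_channel_
```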
def save_cache(self):
with open(self.cache_file, "w", encoding="utf-8") as f:
json.dump(self.cache, f, indent=2, ensure_ascii=False)
def _load_channel_cache(self, channel_id: str) -> List[Dict[str, str]]:
"""Load cache for a specific channel."""
cache_file = self._get_channel_cache_file(channel_id)
if cache_file.exists():
try:
with open(cache_file, 'r', encoding='utf-8') as f:
data = json.load(f)
return data.get('videos', [])
except (json.JSONDecodeError, KeyError):
print(f" ⚠️ Corrupted cache file for {channel_id}, will recreate")
return []
return []
def _save_channel_cache(self, channel_id: str, videos: List[Dict[str, str]]):
"""Save cache for a specific channel."""
cache_file = self._get_channel_cache_file(channel_id)
data = {
'channel_id': channel_id,
'videos': videos,
'last_updated': datetime.now().isoformat(),
'video_count': len(videos)
}
with open(cache_file, 'w', encoding='utf-8') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
def _clear_channel_cache(self, channel_id: str):
"""Clear cache for a specific channel."""
cache_file = self._get_channel_cache_file(channel_id)
if cache_file.exists():
cache_file.unlink()
print(f" 🗑️ Cleared cache file: {cache_file.name}")
def get_cache_info(self):
"""Get information about all channel cache files."""
cache_files = list(self.cache_dir.glob("*.json"))
total_videos = 0
cache_info = []
for cache_file in cache_files:
try:
with open(cache_file, 'r', encoding='utf-8') as f:
data = json.load(f)
video_count = len(data.get('videos', []))
total_videos += video_count
last_updated = data.get('last_updated', 'Unknown')
cache_info.append({
'channel': data.get('channel_id', cache_file.stem),
'videos': video_count,
'last_updated': last_updated,
'file': cache_file.name
})
except Exception as e:
print(f"⚠️ Error reading cache file {cache_file.name}: {e}")
return {
'total_channels': len(cache_files),
'total_videos': total_videos,
'channels': cache_info
}
def clear_channel_cache(self, channel_id=None):
"""Clear cache for a specific channel or all channels."""
if channel_id:
self._clear_channel_cache(channel_id)
print(f"🗑️ Cleared cache for channel: {channel_id}")
else:
# Clear all cache files
cache_files = list(self.cache_dir.glob("*.json"))
for cache_file in cache_files:
cache_file.unlink()
print(f"🗑️ Cleared all {len(cache_files)} channel cache files")
def set_cache_duration(self, hours):
"""Placeholder for cache duration logic"""
pass
def export_playlist_report(self, playlist_id):
"""Export a report for a specific playlist."""
pass
def get_statistics(self):
"""Get statistics about tracked songs."""
total_songs = len(self.data["songs"])
downloaded_songs = sum(
1
@ -102,11 +191,13 @@ class TrackingManager:
}
def get_playlist_songs(self, playlist_id):
"""Get songs for a specific playlist."""
return [
s for s in self.data["songs"].values() if s["playlist_id"] == playlist_id
]
def get_failed_songs(self, playlist_id=None):
"""Get failed songs, optionally filtered by playlist."""
if playlist_id:
return [
s
@ -118,6 +209,7 @@ class TrackingManager:
]
def get_partial_downloads(self, playlist_id=None):
"""Get partial downloads, optionally filtered by playlist."""
if playlist_id:
return [
s
@ -129,7 +221,7 @@ class TrackingManager:
]
def cleanup_orphaned_files(self, downloads_dir):
# Remove tracking entries for files that no longer exist
"""Remove tracking entries for files that no longer exist."""
orphaned = []
for song_id, song in list(self.data["songs"].items()):
file_path = song.get("file_path")
@ -139,51 +231,17 @@ class TrackingManager:
self.force_save()
return orphaned
def get_cache_info(self):
total_channels = len(self.cache)
total_cached_videos = sum(len(v) for v in self.cache.values())
cache_duration_hours = 24 # default
last_updated = None
return {
"total_channels": total_channels,
"total_cached_videos": total_cached_videos,
"cache_duration_hours": cache_duration_hours,
"last_updated": last_updated,
}
def clear_channel_cache(self, channel_id=None):
if channel_id is None or channel_id == "all":
self.cache = {}
else:
self.cache.pop(channel_id, None)
self.save_cache()
def set_cache_duration(self, hours):
# Placeholder for cache duration logic
pass
def export_playlist_report(self, playlist_id):
playlist = self.data["playlists"].get(playlist_id)
if not playlist:
return f"Playlist '{playlist_id}' not found."
songs = self.get_playlist_songs(playlist_id)
report = {"playlist": playlist, "songs": songs}
return json.dumps(report, indent=2, ensure_ascii=False)
def is_song_downloaded(self, artist, title, channel_name=None, video_id=None):
"""
Check if a song has already been downloaded by this system.
Returns True if the song exists in tracking with DOWNLOADED or CONVERTED status.
Check if a song has already been downloaded.
Returns True if the song exists in tracking with DOWNLOADED status.
"""
# If we have video_id and channel_name, try direct key lookup first (most efficient)
if video_id and channel_name:
song_key = f"{video_id}@{channel_name}"
if song_key in self.data["songs"]:
song_data = self.data["songs"][song_key]
if song_data.get("status") in [
SongStatus.DOWNLOADED,
SongStatus.CONVERTED,
]:
if song_data.get("status") == SongStatus.DOWNLOADED:
return True
# Fallback to content search (for cases where we don't have video_id)
@ -191,19 +249,14 @@ class TrackingManager:
# Check if this song matches the artist and title
if song_data.get("artist") == artist and song_data.get("title") == title:
# Check if it's marked as downloaded
if song_data.get("status") in [
SongStatus.DOWNLOADED,
SongStatus.CONVERTED,
]:
if song_data.get("status") == SongStatus.DOWNLOADED:
return True
# Also check the video title field which might contain the song info
video_title = song_data.get("video_title", "")
if video_title and artist in video_title and title in video_title:
if song_data.get("status") in [
SongStatus.DOWNLOADED,
SongStatus.CONVERTED,
]:
if song_data.get("status") == SongStatus.DOWNLOADED:
return True
return False
def is_file_exists(self, file_path):
@ -283,65 +336,359 @@ class TrackingManager:
self._save()
def get_channel_video_list(
self, channel_url, yt_dlp_path="downloader/yt-dlp.exe", force_refresh=False
self, channel_url, yt_dlp_path="downloader/yt-dlp.exe", force_refresh=False, show_pagination=False
):
"""
Return a list of videos (dicts with 'title' and 'id') for the channel, using cache if available unless force_refresh is True.
Args:
channel_url: YouTube channel URL
yt_dlp_path: Path to yt-dlp executable
force_refresh: Force refresh cache even if available
show_pagination: Show page-by-page progress (slower but more detailed)
"""
channel_name, channel_id = None, None
# Check if this is a manual channel
from karaoke_downloader.manual_video_manager import is_manual_channel, get_manual_channel_info, get_manual_videos_for_channel
if is_manual_channel(channel_url):
channel_name, channel_id = get_manual_channel_info(channel_url)
if channel_name and channel_id:
print(f" 📋 Loading manual videos for {channel_name}")
manual_videos = get_manual_videos_for_channel(channel_name)
# Convert to the expected format
videos = []
for video in manual_videos:
videos.append({
"title": video.get("title", ""),
"id": video.get("id", ""),
"url": video.get("url", "")
})
print(f" ✅ Loaded {len(videos)} manual videos")
return videos
else:
print(f" ❌ Could not get manual channel info for: {channel_url}")
return []
# Regular YouTube channel processing
from karaoke_downloader.youtube_utils import get_channel_info
channel_name, channel_id = get_channel_info(channel_url)
if not channel_id:
print(f" ❌ Could not extract channel ID from URL: {channel_url}")
return []
# Try multiple possible cache keys
possible_keys = [
channel_id, # The extracted channel ID
channel_url, # The full URL
channel_name, # The extracted channel name
]
print(f" 🔍 Channel: {channel_name} (ID: {channel_id})")
cache_key = None
for key in possible_keys:
if key and key in self.cache:
cache_key = key
break
# Check if we have cached data for this channel
if not force_refresh:
cached_videos = self._load_channel_cache(channel_id)
if cached_videos:
# Validate that the cached data has proper video IDs
corrupted = False
# Check if any video IDs look like titles instead of proper YouTube IDs
for video in cached_videos[:20]: # Check first 20 videos
video_id = video.get("id", "")
# More comprehensive validation - YouTube IDs should be 11 characters and contain only alphanumeric, hyphens, and underscores
if video_id and (
len(video_id) != 11 or
not video_id.replace('-', '').replace('_', '').isalnum() or
" " in video_id or
"Lyrics" in video_id or
"KARAOKE" in video_id.upper() or
"Vocal" in video_id or
"Guide" in video_id
):
print(f" ⚠️ Detected corrupted video ID in cache: '{video_id}'")
corrupted = True
break
if corrupted:
print(f" 🧹 Clearing corrupted cache for {channel_id}")
self._clear_channel_cache(channel_id)
force_refresh = True
else:
print(f" 📋 Using cached video list ({len(cached_videos)} videos)")
return cached_videos
if not cache_key:
cache_key = channel_id or channel_url # Use as fallback for new entries
print(f" 🔍 Trying cache keys: {possible_keys}")
print(f" 🔍 Selected cache key: '{cache_key}'")
if not force_refresh and cache_key in self.cache:
print(
f" 📋 Using cached video list ({len(self.cache[cache_key])} videos)"
)
return self.cache[cache_key]
# Choose fetch method based on show_pagination flag
if show_pagination:
return self._fetch_videos_with_pagination(channel_url, channel_id, yt_dlp_path)
else:
print(f" ❌ Cache miss for all keys")
return self._fetch_videos_flat_playlist(channel_url, channel_id, yt_dlp_path)
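The page windows the pagination path hands to `--playlist-start`/`--playlist-end` follow directly from the page number; a quick sketch of that arithmetic (the helper name `page_window` is hypothetical):

```python
from typing import Tuple

def page_window(page: int, per_page: int = 200) -> Tuple[int, int]:
    # 1-based inclusive range passed as --playlist-start / --playlist-end.
    return (page - 1) * per_page + 1, page * per_page

print(page_window(1))  # (1, 200)
print(page_window(3))  # (401, 600)
```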
def _fetch_videos_with_pagination(self, channel_url, channel_id, yt_dlp_path):
"""Fetch videos showing page-by-page progress."""
print(f" 🌐 Fetching video list from YouTube (page-by-page mode)...")
print(f" 📡 Channel URL: {channel_url}")
import subprocess
all_videos = []
page = 1
videos_per_page = 200 # YouTube/yt-dlp supports up to 200 videos per page, reducing API calls and errors
while True:
print(f" 📄 Fetching page {page}...")
# Fetch one page at a time
cmd = [
yt_dlp_path,
"--flat-playlist",
"--print",
"%(title)s|%(id)s|%(url)s",
"--playlist-start",
str((page - 1) * videos_per_page + 1),
"--playlist-end",
str(page * videos_per_page),
channel_url,
]
try:
# Increased timeout to 180 seconds for larger pages (200 videos)
result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=180)
lines = result.stdout.strip().splitlines()
# Save raw output for debugging (for each page)
raw_output_file = self._get_channel_cache_file(channel_id).parent / f"{channel_id}_raw_output_page{page}.txt"
try:
with open(raw_output_file, 'w', encoding='utf-8') as f:
f.write(f"# Raw yt-dlp output for {channel_id} - Page {page}\n")
f.write(f"# Channel URL: {channel_url}\n")
f.write(f"# Command: {' '.join(cmd)}\n")
f.write(f"# Timestamp: {datetime.now().isoformat()}\n")
f.write(f"# Total lines: {len(lines)}\n")
f.write("#" * 80 + "\n\n")
for i, line in enumerate(lines, 1):
f.write(f"{i:6d}: {line}\n")
print(f" 💾 Saved raw output to: {raw_output_file.name}")
except Exception as e:
print(f" ⚠️ Could not save raw output: {e}")
if not lines:
print(f" ✅ No more videos found on page {page}")
break
print(f" 📊 Page {page}: Found {len(lines)} videos")
page_videos = []
invalid_count = 0
for line in lines:
if not line.strip():
continue
# More robust parsing that handles titles with | characters
# Extract video ID directly from the URL that yt-dlp provides
# Find the URL and extract video ID from it
url_match = re.search(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
if not url_match:
continue
# Extract video ID directly from the URL
video_id = url_match.group(1)
# Extract title (everything before the video ID in the line)
title = line[:line.find(video_id)].rstrip('|').strip()
# Validate video ID
if video_id and (
len(video_id) == 11 and
video_id.replace('-', '').replace('_', '').isalnum() and
" " not in video_id and
"Lyrics" not in video_id and
"KARAOKE" not in video_id.upper() and
"Vocal" not in video_id and
"Guide" not in video_id
):
page_videos.append({"title": title, "id": video_id})
else:
invalid_count += 1
if invalid_count <= 3: # Show first 3 invalid IDs per page
print(f" ⚠️ Invalid ID: '{video_id}' for '{title[:50]}...'")
if invalid_count > 3:
print(f" ⚠️ ... and {invalid_count - 3} more invalid IDs on this page")
all_videos.extend(page_videos)
print(f" ✅ Page {page}: Added {len(page_videos)} valid videos (total: {len(all_videos)})")
# If we got fewer videos than expected, we're probably at the end
if len(lines) < videos_per_page:
print(f" 🏁 Reached end of channel (last page had {len(lines)} videos)")
break
page += 1
# Safety check to prevent infinite loops
if page > 50: # Max 50 pages (10,000 videos with 200 per page)
print(f" ⚠️ Reached maximum page limit (50 pages), stopping")
break
except subprocess.TimeoutExpired:
print(f" ⚠️ Page {page} timed out, stopping")
break
except subprocess.CalledProcessError as e:
print(f" ❌ Error fetching page {page}: {e}")
break
except KeyboardInterrupt:
print(f" ⏹️ User interrupted, stopping at page {page}")
break
if not all_videos:
print(f" ❌ No valid videos found")
return []
print(f" 🎉 Channel download complete!")
print(f" 📊 Total videos fetched: {len(all_videos)}")
# Save to individual channel cache file
self._save_channel_cache(channel_id, all_videos)
print(f" 💾 Saved cache to: {self._get_channel_cache_file(channel_id).name}")
return all_videos
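The URL-anchored parsing used above (extract the 11-character ID from the watch URL, then take everything before its first occurrence as the title) survives titles that themselves contain `|`; a standalone sketch with a hypothetical `parse_line` helper:

```python
import re

WATCH_RE = re.compile(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})')

def parse_line(line: str):
    m = WATCH_RE.search(line)
    if not m:
        return None
    video_id = m.group(1)
    # Title is everything before the first occurrence of the ID (the id field).
    title = line[:line.find(video_id)].rstrip('|').strip()
    return {"title": title, "id": video_id}

line = "Song | With Pipes (Karaoke)|dQw4w9WgXcQ|https://www.youtube.com/watch?v=dQw4w9WgXcQ"
print(parse_line(line))  # {'title': 'Song | With Pipes (Karaoke)', 'id': 'dQw4w9WgXcQ'}
```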
def _fetch_videos_flat_playlist(self, channel_url, channel_id, yt_dlp_path):
"""Fetch all videos using flat playlist (faster but less detailed progress)."""
# Fetch with yt-dlp
print(f" 🌐 Fetching video list from YouTube (this may take a while)...")
print(f" 📡 Channel URL: {channel_url}")
import subprocess
from karaoke_downloader.youtube_utils import _parse_yt_dlp_command
cmd = [
yt_dlp_path,
# First, let's get the total count to show progress
count_cmd = _parse_yt_dlp_command(yt_dlp_path) + [
"--flat-playlist",
"--print",
"%(title)s",
"--playlist-end",
"1", # Just get first video to test
channel_url,
]
try:
print(f" 🔍 Testing channel access...")
test_result = subprocess.run(count_cmd, capture_output=True, text=True, timeout=30)
if test_result.returncode == 0:
print(f" ✅ Channel is accessible")
else:
print(f" ⚠️ Channel test failed: {test_result.stderr}")
except subprocess.TimeoutExpired:
print(f" ⚠️ Channel test timed out")
except Exception as e:
print(f" ⚠️ Channel test error: {e}")
# Now fetch all videos with progress indicators
cmd = _parse_yt_dlp_command(yt_dlp_path) + [
"--flat-playlist",
"--print",
"%(title)s|%(id)s|%(url)s",
"--verbose", # Add verbose output to see what's happening
channel_url,
]
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(f" 🔧 Running yt-dlp command: {' '.join(cmd)}")
print(f" 📥 Starting video list download...")
# Use a timeout and show progress
result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=300)
lines = result.stdout.strip().splitlines()
# Save raw output for debugging
raw_output_file = self._get_channel_cache_file(channel_id).parent / f"{channel_id}_raw_output.txt"
try:
with open(raw_output_file, 'w', encoding='utf-8') as f:
f.write(f"# Raw yt-dlp output for {channel_id}\n")
f.write(f"# Channel URL: {channel_url}\n")
f.write(f"# Command: {' '.join(cmd)}\n")
f.write(f"# Timestamp: {datetime.now().isoformat()}\n")
f.write(f"# Total lines: {len(lines)}\n")
f.write("#" * 80 + "\n\n")
for i, line in enumerate(lines, 1):
f.write(f"{i:6d}: {line}\n")
print(f" 💾 Saved raw output to: {raw_output_file.name}")
except Exception as e:
print(f" ⚠️ Could not save raw output: {e}")
print(f" 📄 Raw output lines: {len(lines)}")
print(f" 📊 Download completed successfully!")
# Show some sample lines to understand the format
if lines:
print(f" 📋 Sample output format:")
for i, line in enumerate(lines[:3]):
print(f" Line {i+1}: {line[:100]}...")
if len(lines) > 3:
print(f" ... and {len(lines) - 3} more lines")
videos = []
for line in lines:
parts = line.split("|")
if len(parts) >= 2:
title, video_id = parts[0].strip(), parts[1].strip()
invalid_count = 0
print(f" 🔍 Processing {len(lines)} video entries...")
for i, line in enumerate(lines):
if i % 1000 == 0 and i > 0: # Progress indicator every 1000 lines
print(f" 📊 Processing line {i}/{len(lines)}... ({i/len(lines)*100:.1f}%)")
# More robust parsing that handles titles with | characters
# Extract video ID directly from the URL that yt-dlp provides
# Find the URL and extract video ID from it
url_match = re.search(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
if not url_match:
invalid_count += 1
if invalid_count <= 5:
print(f" ⚠️ Skipping line with no URL: '{line[:100]}...'")
elif invalid_count == 6:
print(f" ⚠️ ... and {len(lines) - i - 1} more invalid lines")
continue
# Extract video ID directly from the URL
video_id = url_match.group(1)
# Extract title (everything before the video ID in the line)
title = line[:line.find(video_id)].rstrip('|').strip()
# Validate video ID
if video_id and (
len(video_id) == 11 and
video_id.replace('-', '').replace('_', '').isalnum() and
" " not in video_id and
"Lyrics" not in video_id and
"KARAOKE" not in video_id.upper() and
"Vocal" not in video_id and
"Guide" not in video_id
):
videos.append({"title": title, "id": video_id})
self.cache[cache_key] = videos
self.save_cache()
else:
invalid_count += 1
if invalid_count <= 5: # Only show first 5 invalid IDs
print(f" ⚠️ Skipping invalid video ID: '{video_id}' for title: '{title[:50]}...'")
elif invalid_count == 6:
print(f" ⚠️ ... and {len(lines) - i - 1} more invalid IDs")
if not videos:
print(f" ❌ No valid videos found after parsing")
return []
print(f" ✅ Parsed {len(videos)} valid videos from YouTube")
print(f" ⚠️ Skipped {invalid_count} invalid video IDs")
# Save to individual channel cache file
self._save_channel_cache(channel_id, videos)
print(f" 💾 Saved cache to: {self._get_channel_cache_file(channel_id).name}")
return videos
except subprocess.TimeoutExpired:
print(f"❌ yt-dlp timed out after 5 minutes - channel may be too large")
return []
except subprocess.CalledProcessError as e:
print(f"❌ yt-dlp failed to fetch playlist for cache: {e}")
print(f" 📄 stderr: {e.stderr}")
return []


@ -106,6 +106,10 @@ def download_single_video(
print(f"⬇️ Downloading: {artist} - {title} -> {output_path}")
video_url = f"https://www.youtube.com/watch?v={video_id}"
# Debug: Show the video_id and URL being used
print(f"🔍 DEBUG: video_id = '{video_id}'")
print(f"🔍 DEBUG: video_url = '{video_url}'")
# Build command using centralized utility
cmd = build_yt_dlp_command(yt_dlp_path, video_url, output_path, config)
@ -255,7 +259,7 @@ def execute_download_plan(
video_id = item["video_id"]
video_title = item["video_title"]
print(f"\n⬇️ Downloading {len(download_plan) - idx} of {total_to_download}:")
print(f"\n⬇️ Downloading {downloaded_count + 1} of {total_to_download}:")
print(f" 📋 Songlist: {artist} - {title}")
print(f" 🎬 Video: {video_title} ({channel_name})")
if "match_score" in item:


@ -9,6 +9,19 @@ from typing import Any, Dict, List, Optional, Union
from karaoke_downloader.config_manager import AppConfig
def _parse_yt_dlp_command(yt_dlp_path: str) -> List[str]:
"""
Parse yt-dlp path/command into a list of command arguments.
Handles both file paths and command strings like 'python3 -m yt_dlp'.
"""
if yt_dlp_path.startswith(('python', 'python3')):
# It's a Python module command
return yt_dlp_path.split()
else:
# It's a file path
return [yt_dlp_path]
def get_channel_info(
channel_url: str, yt_dlp_path: str = "downloader/yt-dlp.exe"
) -> tuple[str, str]:
@ -43,7 +56,7 @@ def get_playlist_info(
) -> List[Dict[str, Any]]:
"""Get playlist information using yt-dlp."""
try:
cmd = [yt_dlp_path, "--dump-json", "--flat-playlist", playlist_url]
cmd = _parse_yt_dlp_command(yt_dlp_path) + ["--dump-json", "--flat-playlist", playlist_url]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
videos = []
for line in result.stdout.strip().split("\n"):
@ -75,8 +88,7 @@ def build_yt_dlp_command(
Returns:
List of command arguments for subprocess.run
"""
cmd = [
str(yt_dlp_path),
cmd = _parse_yt_dlp_command(yt_dlp_path) + [
"--no-check-certificates",
"--ignore-errors",
"--no-warnings",
@ -128,7 +140,7 @@ def show_available_formats(
timeout: Timeout in seconds
"""
print(f"🔍 Checking available formats for: {video_url}")
format_cmd = [str(yt_dlp_path), "--list-formats", video_url]
format_cmd = _parse_yt_dlp_command(yt_dlp_path) + ["--list-formats", video_url]
try:
format_result = subprocess.run(
format_cmd, capture_output=True, text=True, timeout=timeout

setup_macos.py Normal file

@ -0,0 +1,220 @@
#!/usr/bin/env python3
"""
macOS setup script for Karaoke Video Downloader.
This script helps users set up yt-dlp and FFmpeg on macOS.
"""
import os
import sys
import subprocess
from pathlib import Path
def check_ffmpeg():
"""Check if FFmpeg is installed."""
try:
result = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True, timeout=10)
return result.returncode == 0
except (subprocess.TimeoutExpired, FileNotFoundError):
return False
def check_yt_dlp():
"""Check if yt-dlp is installed via pip or binary."""
# Check pip installation
try:
result = subprocess.run([sys.executable, "-m", "yt_dlp", "--version"],
capture_output=True, text=True, timeout=10)
if result.returncode == 0:
return True
except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
pass
# Check binary file
binary_path = Path("downloader/yt-dlp_macos")
if binary_path.exists():
try:
result = subprocess.run([str(binary_path), "--version"],
capture_output=True, text=True, timeout=10)
return result.returncode == 0
except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
pass
return False
def install_ffmpeg():
"""Install FFmpeg via Homebrew."""
print("🎬 Installing FFmpeg...")
# Check if Homebrew is installed
try:
subprocess.run(["brew", "--version"], capture_output=True, check=True)
except (subprocess.CalledProcessError, FileNotFoundError):
print("❌ Homebrew is not installed. Please install Homebrew first:")
print(" /bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"")
return False
try:
print("🍺 Installing FFmpeg via Homebrew...")
result = subprocess.run(["brew", "install", "ffmpeg"],
capture_output=True, text=True, check=True)
print("✅ FFmpeg installed successfully!")
return True
except subprocess.CalledProcessError as e:
print(f"❌ Failed to install FFmpeg: {e}")
return False
def download_yt_dlp_binary():
"""Download yt-dlp binary for macOS."""
print("📥 Downloading yt-dlp binary for macOS...")
# Create downloader directory if it doesn't exist
downloader_dir = Path("downloader")
downloader_dir.mkdir(exist_ok=True)
# Download yt-dlp binary
binary_path = downloader_dir / "yt-dlp_macos"
url = "https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos"
try:
print(f"📡 Downloading from: {url}")
result = subprocess.run(["curl", "-L", "-o", str(binary_path), url],
capture_output=True, text=True, check=True)
# Make it executable
binary_path.chmod(0o755)
print(f"✅ yt-dlp binary downloaded to: {binary_path}")
# Test the binary
test_result = subprocess.run([str(binary_path), "--version"],
capture_output=True, text=True, timeout=10)
if test_result.returncode == 0:
version = test_result.stdout.strip()
print(f"✅ Binary test successful! Version: {version}")
return True
else:
print(f"❌ Binary test failed: {test_result.stderr}")
return False
except subprocess.CalledProcessError as e:
print(f"❌ Failed to download yt-dlp binary: {e}")
return False
except Exception as e:
print(f"❌ Error downloading binary: {e}")
return False
def install_yt_dlp():
"""Install yt-dlp via pip."""
print("📦 Installing yt-dlp...")
try:
result = subprocess.run([sys.executable, "-m", "pip", "install", "yt-dlp"],
capture_output=True, text=True, check=True)
print("✅ yt-dlp installed successfully!")
return True
except subprocess.CalledProcessError as e:
print(f"❌ Failed to install yt-dlp: {e}")
return False
def test_installation():
"""Test the installation."""
print("\n🧪 Testing installation...")
# Test FFmpeg
if check_ffmpeg():
print("✅ FFmpeg is working!")
else:
print("❌ FFmpeg is not working")
return False
# Test yt-dlp
if check_yt_dlp():
print("✅ yt-dlp is working!")
else:
print("❌ yt-dlp is not working")
return False
return True
def main():
print("🍎 macOS Setup for Karaoke Video Downloader")
print("=" * 50)
# Check current status
print("🔍 Checking current installation...")
ffmpeg_installed = check_ffmpeg()
yt_dlp_installed = check_yt_dlp()
print(f"FFmpeg: {'✅ Installed' if ffmpeg_installed else '❌ Not installed'}")
print(f"yt-dlp: {'✅ Installed' if yt_dlp_installed else '❌ Not installed'}")
if ffmpeg_installed and yt_dlp_installed:
print("\n🎉 Everything is already installed and working!")
return
# Install missing components
print("\n🚀 Installing missing components...")
# Install FFmpeg if needed
if not ffmpeg_installed:
print("\n🎬 FFmpeg Installation Options:")
print("1. Install via Homebrew (recommended)")
print("2. Download from ffmpeg.org")
print("3. Skip FFmpeg installation")
choice = input("\nChoose an option (1-3): ").strip()
if choice == "1":
if not install_ffmpeg():
print("❌ FFmpeg installation failed")
return
elif choice == "2":
print("📥 Please download FFmpeg from: https://ffmpeg.org/download.html")
print(" Extract and add to your PATH, then run this script again.")
return
elif choice == "3":
print("⚠️ FFmpeg is required for video processing. Some features may not work.")
else:
print("❌ Invalid choice")
return
# Install yt-dlp if needed
if not yt_dlp_installed:
print("\n📦 yt-dlp Installation Options:")
print("1. Install via pip (recommended)")
print("2. Download binary file")
print("3. Skip yt-dlp installation")
choice = input("\nChoose an option (1-3): ").strip()
if choice == "1":
if not install_yt_dlp():
print("❌ yt-dlp installation failed")
return
elif choice == "2":
if not download_yt_dlp_binary():
print("❌ yt-dlp binary download failed")
return
elif choice == "3":
print("❌ yt-dlp is required for video downloading.")
return
else:
print("❌ Invalid choice")
return
# Test installation
if test_installation():
print("\n🎉 Setup completed successfully!")
print("You can now use the Karaoke Video Downloader on macOS.")
print("Run: python download_karaoke.py --help")
else:
print("\n❌ Setup failed. Please check the error messages above.")
if __name__ == "__main__":
main()


@ -0,0 +1,198 @@
#!/usr/bin/env python3
"""
Helper script to add manual videos to the manual videos collection.
"""
import json
import re
from pathlib import Path
from typing import Dict, List, Optional
from karaoke_downloader.data_path_manager import get_data_path_manager
def extract_video_id(url: str) -> Optional[str]:
"""Extract video ID from YouTube URL."""
patterns = [
r'(?:youtube\.com/watch\?v=|youtu\.be/|youtube\.com/embed/)([a-zA-Z0-9_-]{11})',
r'youtube\.com/watch\?.*v=([a-zA-Z0-9_-]{11})'
]
for pattern in patterns:
match = re.search(pattern, url)
if match:
return match.group(1)
return None
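A quick check of the patterns above against the URL shapes they are meant to cover (short `youtu.be` links and watch URLs where `v=` is not the first query parameter):

```python
import re
from typing import Optional

PATTERNS = [
    r'(?:youtube\.com/watch\?v=|youtu\.be/|youtube\.com/embed/)([a-zA-Z0-9_-]{11})',
    r'youtube\.com/watch\?.*v=([a-zA-Z0-9_-]{11})',
]

def extract_video_id(url: str) -> Optional[str]:
    # First pattern handles the common shapes; the second catches
    # watch URLs where v= appears after other query parameters.
    for pattern in PATTERNS:
        m = re.search(pattern, url)
        if m:
            return m.group(1)
    return None

print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))                      # dQw4w9WgXcQ
print(extract_video_id("https://www.youtube.com/watch?t=10&v=dQw4w9WgXcQ"))  # dQw4w9WgXcQ
```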
def add_manual_video(title: str, url: str, manual_file: Optional[str] = None):
"""
Add a manual video to the collection.
Args:
title: Video title (e.g., "Artist - Song (Karaoke Version)")
url: YouTube URL
manual_file: Path to manual videos JSON file
"""
if manual_file is None:
manual_file = str(get_data_path_manager().get_manual_videos_path())
manual_path = Path(manual_file)
# Load existing data or create new
if manual_path.exists():
with open(manual_path, 'r', encoding='utf-8') as f:
data = json.load(f)
else:
data = {
"channel_name": "@ManualVideos",
"channel_url": "manual://static",
"description": "Manual collection of individual karaoke videos",
"videos": [],
"parsing_rules": {
"format": "artist_title_separator",
"separator": " - ",
"artist_first": True,
"title_cleanup": {
"remove_suffix": {
"suffixes": ["(Karaoke)", "(Karaoke Version)", "(Karaoke Version) Lyrics"]
}
}
}
}
# Extract video ID
video_id = extract_video_id(url)
if not video_id:
print(f"❌ Could not extract video ID from URL: {url}")
return False
# Check if video already exists
existing_ids = [video.get("id") for video in data["videos"]]
if video_id in existing_ids:
print(f"⚠️ Video already exists: {title}")
return False
# Add new video
new_video = {
"title": title,
"url": url,
"id": video_id,
"upload_date": "2024-01-01", # Default date
"duration": 180, # Default duration
"view_count": 1000 # Default view count
}
data["videos"].append(new_video)
# Save updated data
manual_path.parent.mkdir(parents=True, exist_ok=True)
with open(manual_path, 'w', encoding='utf-8') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
print(f"✅ Added video: {title}")
print(f" URL: {url}")
print(f" ID: {video_id}")
return True
def list_manual_videos(manual_file: Optional[str] = None):
"""List all manual videos."""
if manual_file is None:
manual_file = str(get_data_path_manager().get_manual_videos_path())
manual_path = Path(manual_file)
if not manual_path.exists():
print("❌ No manual videos file found")
return
with open(manual_path, 'r', encoding='utf-8') as f:
data = json.load(f)
print(f"📋 Manual Videos ({len(data['videos'])} videos):")
print("=" * 60)
for i, video in enumerate(data['videos'], 1):
print(f"{i:2d}. {video['title']}")
print(f" URL: {video['url']}")
print(f" ID: {video['id']}")
print()
def remove_manual_video(video_id: str, manual_file: Optional[str] = None):
"""Remove a manual video by ID."""
if manual_file is None:
manual_file = str(get_data_path_manager().get_manual_videos_path())
manual_path = Path(manual_file)
if not manual_path.exists():
print("❌ No manual videos file found")
return False
with open(manual_path, 'r', encoding='utf-8') as f:
data = json.load(f)
# Find and remove video
for i, video in enumerate(data['videos']):
if video['id'] == video_id:
removed_video = data['videos'].pop(i)
with open(manual_path, 'w', encoding='utf-8') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
print(f"✅ Removed video: {removed_video['title']}")
return True
print(f"❌ Video with ID '{video_id}' not found")
return False
def main():
"""Interactive mode for adding manual videos."""
print("🎤 Manual Video Manager")
print("=" * 30)
print("1. Add video")
print("2. List videos")
print("3. Remove video")
print("4. Exit")
while True:
choice = input("\nSelect option (1-4): ").strip()
if choice == "1":
title = input("Enter video title (e.g., 'Artist - Song (Karaoke Version)'): ").strip()
url = input("Enter YouTube URL: ").strip()
if title and url:
add_manual_video(title, url)
else:
print("❌ Title and URL are required")
elif choice == "2":
list_manual_videos()
elif choice == "3":
video_id = input("Enter video ID to remove: ").strip()
if video_id:
remove_manual_video(video_id)
else:
print("❌ Video ID is required")
elif choice == "4":
print("👋 Goodbye!")
break
else:
print("❌ Invalid option")
if __name__ == "__main__":
import sys
if len(sys.argv) > 1:
# Command line mode
if sys.argv[1] == "add" and len(sys.argv) >= 4:
add_manual_video(sys.argv[2], sys.argv[3])
elif sys.argv[1] == "list":
list_manual_videos()
elif sys.argv[1] == "remove" and len(sys.argv) >= 3:
remove_manual_video(sys.argv[2])
else:
print("Usage:")
print(" python add_manual_video.py add 'Title' 'URL'")
print(" python add_manual_video.py list")
print(" python add_manual_video.py remove VIDEO_ID")
else:
# Interactive mode
main()


@ -0,0 +1,127 @@
#!/usr/bin/env python3
"""
Script to build channel cache from raw yt-dlp output file.
This uses the fixed parsing logic to handle titles with | characters.
"""
import json
import re
from datetime import datetime
from pathlib import Path
from karaoke_downloader.data_path_manager import get_data_path_manager
def parse_raw_output_file(raw_file_path):
"""Parse the raw output file and extract valid videos."""
videos = []
invalid_count = 0
print(f"🔍 Parsing raw output file: {raw_file_path}")
with open(raw_file_path, 'r', encoding='utf-8') as f:
lines = f.readlines()
# Skip header lines (lines starting with #)
data_lines = [line for line in lines if not line.strip().startswith('#') and line.strip()]
print(f"📄 Found {len(data_lines)} data lines to process")
for i, line in enumerate(data_lines):
if i % 1000 == 0 and i > 0: # Progress indicator every 1000 lines
print(f"📊 Processing line {i}/{len(data_lines)}... ({i/len(data_lines)*100:.1f}%)")
# Remove line number prefix (e.g., " 1234: ")
line = re.sub(r'^\s*\d+:\s*', '', line.strip())
# More robust parsing that handles titles with | characters
# Extract video ID directly from the URL that yt-dlp provides
# Find the URL and extract video ID from it
url_match = re.search(r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
if not url_match:
invalid_count += 1
if invalid_count <= 5:
print(f"⚠️ Skipping line with no URL: '{line[:100]}...'")
elif invalid_count == 6:
print("⚠️ Suppressing further warnings about lines with no URL...")
continue
# Extract video ID directly from the URL
video_id = url_match.group(1)
# Extract title (everything before the URL in the line)
title = line[:url_match.start()].rstrip('| ').strip()
# Validate video ID
if video_id and (
len(video_id) == 11 and
video_id.replace('-', '').replace('_', '').isalnum() and
" " not in video_id and
"Lyrics" not in video_id and
"KARAOKE" not in video_id.upper() and
"Vocal" not in video_id and
"Guide" not in video_id
):
videos.append({"title": title, "id": video_id})
else:
invalid_count += 1
if invalid_count <= 5: # Only show first 5 invalid IDs
print(f"⚠️ Skipping invalid video ID: '{video_id}' for title: '{title[:50]}...'")
elif invalid_count == 6:
print("⚠️ Suppressing further invalid-ID warnings...")
print(f"✅ Parsed {len(videos)} valid videos from raw output")
print(f"⚠️ Skipped {invalid_count} invalid video IDs")
return videos
def save_cache_file(channel_id, videos, cache_dir=None):
"""Save the parsed videos to a cache file."""
if cache_dir is None:
cache_dir = str(get_data_path_manager().get_channel_cache_dir())
cache_dir = Path(cache_dir)
cache_dir.mkdir(parents=True, exist_ok=True)
# Sanitize channel ID for filename
safe_channel_id = re.sub(r'[<>:"/\\|?*]', '_', channel_id)
cache_file = cache_dir / f"{safe_channel_id}.json"
data = {
'channel_id': channel_id,
'videos': videos,
'last_updated': datetime.now().isoformat(),
'video_count': len(videos)
}
with open(cache_file, 'w', encoding='utf-8') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
print(f"💾 Saved cache to: {cache_file.name}")
return cache_file
def main():
"""Main function to build cache from raw output."""
data_path_manager = get_data_path_manager()
raw_file_path = data_path_manager.get_channel_cache_dir() / "@VocalStarKaraoke_raw_output.txt"
if not raw_file_path.exists():
print(f"❌ Raw output file not found: {raw_file_path}")
return
# Parse the raw output file
videos = parse_raw_output_file(raw_file_path)
if not videos:
print("❌ No valid videos found")
return
# Save to cache file
channel_id = "@VocalStarKaraoke"
cache_file = save_cache_file(channel_id, videos)
print(f"🎉 Cache build complete!")
print(f"📊 Total videos in cache: {len(videos)}")
print(f"📁 Cache file: {cache_file}")
if __name__ == "__main__":
main()
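The parsing approach above — anchoring on the URL and deriving the video ID from it, rather than splitting on `|` — can be sketched in isolation. The sample line below is invented; only the regex mirrors the script:

```python
import re

def parse_line(line: str):
    """Return (title, video_id) from a 'title | url' line, or None if no URL."""
    url_match = re.search(
        r'https://www\.youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})', line)
    if not url_match:
        return None
    # Everything before the URL is the title; trim the trailing separator.
    title = line[:url_match.start()].rstrip('| ').strip()
    return title, url_match.group(1)

sample = "Queen - Don't Stop Me Now | Karaoke | https://www.youtube.com/watch?v=abcdefghijk"
print(parse_line(sample))  # ("Queen - Don't Stop Me Now | Karaoke", 'abcdefghijk')
```

Because the ID is taken from the URL itself, any number of `|` characters in the title is harmless.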


@ -0,0 +1,164 @@
#!/usr/bin/env python3
"""
Utility script to identify and clean up duplicate files with (2), (3) suffixes.
This helps clean up files that were created before the duplicate prevention was implemented.
"""
import json
import re
from pathlib import Path
from typing import Dict, List, Tuple
def find_duplicate_files(downloads_dir: str = "downloads") -> Dict[str, List[Tuple[Path, int]]]:
"""
Find duplicate files with (2), (3), etc. suffixes in the downloads directory.
Args:
downloads_dir: Path to downloads directory
Returns:
Dictionary mapping base filenames to lists of duplicate files
"""
downloads_path = Path(downloads_dir)
if not downloads_path.exists():
print(f"❌ Downloads directory not found: {downloads_dir}")
return {}
duplicates = {}
# Scan all MP4 files in the downloads directory
for mp4_file in downloads_path.rglob("*.mp4"):
filename = mp4_file.name
# Check if this is a duplicate file with (2), (3), etc.
match = re.match(r'^(.+?)\s*\((\d+)\)\.mp4$', filename)
if match:
base_name = match.group(1)
suffix_num = int(match.group(2))
if base_name not in duplicates:
duplicates[base_name] = []
duplicates[base_name].append((mp4_file, suffix_num))
# Sort duplicates by suffix number
for base_name in duplicates:
duplicates[base_name].sort(key=lambda x: x[1])
return duplicates
def analyze_duplicates(duplicates: Dict[str, List[Tuple[Path, int]]]) -> None:
"""
Analyze and display information about found duplicates.
Args:
duplicates: Dictionary of duplicate files
"""
if not duplicates:
print("✅ No duplicate files found!")
return
print(f"🔍 Found {len(duplicates)} sets of duplicate files:")
print()
total_duplicates = 0
for base_name, files in duplicates.items():
print(f"📁 {base_name}")
for file_path, suffix in files:
file_size = file_path.stat().st_size / (1024 * 1024) # MB
print(f" ({suffix}) {file_path.name} - {file_size:.1f} MB")
print()
total_duplicates += len(files) - 1 # -1 because we keep the original
print(f"📊 Summary: {len(duplicates)} base files with {total_duplicates} duplicate files")
def cleanup_duplicates(duplicates: Dict[str, List[Tuple[Path, int]]], dry_run: bool = True) -> None:
"""
Clean up duplicate files, keeping only the first occurrence.
Args:
duplicates: Dictionary of duplicate files
dry_run: If True, only show what would be deleted without actually deleting
"""
if not duplicates:
print("✅ No duplicates to clean up!")
return
mode = "DRY RUN" if dry_run else "ACTUAL CLEANUP"
print(f"🧹 Starting {mode}...")
print()
total_deleted = 0
total_size_freed = 0
for base_name, files in duplicates.items():
print(f"📁 Processing: {base_name}")
# Keep the first file (lowest suffix number), delete the rest
files_to_delete = files[1:] # Skip the first file
for file_path, suffix in files_to_delete:
file_size = file_path.stat().st_size / (1024 * 1024) # MB
if dry_run:
print(f" 🗑️ Would delete: {file_path.name} ({file_size:.1f} MB)")
else:
try:
file_path.unlink()
print(f" ✅ Deleted: {file_path.name} ({file_size:.1f} MB)")
total_deleted += 1
total_size_freed += file_size
except Exception as e:
print(f" ❌ Failed to delete {file_path.name}: {e}")
print()
if dry_run:
print(f"📊 DRY RUN SUMMARY: Would delete {len([f for files in duplicates.values() for f in files[1:]])} files")
else:
print(f"📊 CLEANUP SUMMARY: Deleted {total_deleted} files, freed {total_size_freed:.1f} MB")
def main():
"""Main function to run the duplicate file cleanup."""
print("🎵 Karaoke Video Downloader - Duplicate File Cleanup")
print("=" * 50)
print()
# Find duplicates
duplicates = find_duplicate_files()
if not duplicates:
print("✅ No duplicate files found!")
return
# Analyze duplicates
analyze_duplicates(duplicates)
print()
# Ask user what to do
while True:
print("Options:")
print("1. Dry run (show what would be deleted)")
print("2. Actually delete duplicate files")
print("3. Exit without doing anything")
choice = input("\nEnter your choice (1-3): ").strip()
if choice == "1":
cleanup_duplicates(duplicates, dry_run=True)
break
elif choice == "2":
confirm = input("⚠️ Are you sure you want to delete duplicate files? (yes/no): ").strip().lower()
if confirm in ["yes", "y"]:
cleanup_duplicates(duplicates, dry_run=False)
else:
print("❌ Cleanup cancelled.")
break
elif choice == "3":
print("❌ Exiting without cleanup.")
break
else:
print("❌ Invalid choice. Please enter 1, 2, or 3.")
if __name__ == "__main__":
main()
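The suffix pattern used above to detect copies can be exercised on its own; the filenames here are made up:

```python
import re

def split_duplicate_suffix(filename: str):
    """Return (base_name, copy_number) for names like 'Song (2).mp4', else None."""
    match = re.match(r'^(.+?)\s*\((\d+)\)\.mp4$', filename)
    if not match:
        return None
    return match.group(1), int(match.group(2))

print(split_duplicate_suffix("Artist - Song (2).mp4"))  # ('Artist - Song', 2)
print(split_duplicate_suffix("Artist - Song.mp4"))      # None
```

The lazy `(.+?)` plus `\s*` keeps the trailing space out of the base name, so `"Song (2).mp4"` and `"Song (3).mp4"` group under the same key.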


@ -2,7 +2,11 @@ import json
 from pathlib import Path
 from datetime import datetime, time
-def cleanup_recent_tracking(tracking_path="data/songlist_tracking.json", cutoff_time_str="11:00"):
+from karaoke_downloader.data_path_manager import get_data_path_manager
+def cleanup_recent_tracking(tracking_path=None, cutoff_time_str="11:00"):
+if tracking_path is None:
+tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
 """Remove entries from songlist_tracking.json that were added after the specified time today."""
 tracking_file = Path(tracking_path)
 if not tracking_file.exists():


@ -0,0 +1,465 @@
#!/usr/bin/env python3
"""
Fix artist name formatting for Let's Sing Karaoke channel.
This script specifically targets the "Last Name, First Name" format and converts it to
"First Name Last Name" format in ID3 tags. It only processes entries where there is exactly one comma
followed by exactly 2 words, to avoid affecting multi-artist entries.
Usage:
python fix_artist_name_format.py --preview # Show what would be changed
python fix_artist_name_format.py --apply # Actually make the changes
python fix_artist_name_format.py --external "D:\Karaoke\Karaoke\MP4\Let's Sing Karaoke" # Use external directory
"""
import json
import os
import re
import shutil
import argparse
from pathlib import Path
from typing import Dict, List, Tuple, Optional
# Try to import mutagen for ID3 tag manipulation
try:
from mutagen.mp4 import MP4
MUTAGEN_AVAILABLE = True
except ImportError:
MUTAGEN_AVAILABLE = False
print("⚠️ mutagen not available - install with: pip install mutagen")
def is_lastname_firstname_format(artist_name: str) -> bool:
"""
Check if artist name is in "Last Name, First Name" format.
Args:
artist_name: The artist name to check
Returns:
True if the name matches "Last Name, First Name" format with exactly 2 words after comma
"""
if ',' not in artist_name:
return False
# Split by comma
parts = artist_name.split(',', 1)
if len(parts) != 2:
return False
last_name = parts[0].strip()
first_name_part = parts[1].strip()
# Check if there are exactly 2 words after the comma
words_after_comma = first_name_part.split()
if len(words_after_comma) != 2:
return False
# Additional check: make sure it's not a multi-artist entry
# If there are more than 4 words total in the artist name, it might be multi-artist
total_words = len(artist_name.split())
if total_words > 4: # Last, First Name (4 words max for single artist)
return False
return True
def convert_to_firstname_lastname(artist_name: str) -> str:
"""
Convert "Last Name, First Name" to "First Name Last Name".
Args:
artist_name: Artist name in "Last Name, First Name" format
Returns:
Artist name in "First Name Last Name" format
"""
parts = artist_name.split(',', 1)
last_name = parts[0].strip()
first_name_part = parts[1].strip()
# Split the first name part into words
words = first_name_part.split()
if len(words) == 2:
first_name = words[0]
middle_name = words[1]
return f"{first_name} {middle_name} {last_name}"
else:
# Fallback - just reverse the parts
return f"{first_name_part} {last_name}"
def extract_artist_title_from_filename(filename: str) -> Tuple[str, str]:
"""
Extract artist and title from a filename.
Args:
filename: MP4 filename (without extension)
Returns:
Tuple of (artist, title)
"""
# Remove .mp4 extension
if filename.endswith('.mp4'):
filename = filename[:-4]
# Look for " - " separator
if " - " in filename:
parts = filename.split(" - ", 1)
return parts[0].strip(), parts[1].strip()
return "", filename
def update_id3_tags(file_path: str, new_artist: str, apply_changes: bool = False) -> bool:
"""
Update the ID3 tags in an MP4 file.
Args:
file_path: Path to the MP4 file
new_artist: New artist name to set
apply_changes: Whether to actually apply changes or just preview
Returns:
True if successful, False otherwise
"""
if not MUTAGEN_AVAILABLE:
print(f"⚠️ mutagen not available - cannot update ID3 tags for {file_path}")
return False
try:
mp4 = MP4(file_path)
if apply_changes:
# Update the artist tag
mp4["\xa9ART"] = new_artist
mp4.save()
print(f"📝 Updated ID3 tag: {os.path.basename(file_path)} → Artist: '{new_artist}'")
else:
# Just preview what would be changed
current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"
print(f"📝 Would update ID3 tag: {os.path.basename(file_path)} → Artist: '{current_artist}' → '{new_artist}'")
return True
except Exception as e:
print(f"❌ Failed to update ID3 tags for {file_path}: {e}")
return False
def scan_external_directory(directory_path: str) -> List[Dict]:
"""
Scan external directory for MP4 files with "Last Name, First Name" format in ID3 tags.
Args:
directory_path: Path to the external directory
Returns:
List of files that need ID3 tag updates
"""
if not os.path.exists(directory_path):
print(f"❌ Directory not found: {directory_path}")
return []
if not MUTAGEN_AVAILABLE:
print("❌ mutagen not available - cannot scan ID3 tags")
return []
files_to_update = []
# Scan for MP4 files
for file_path in Path(directory_path).glob("*.mp4"):
try:
mp4 = MP4(str(file_path))
current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"
if current_artist and is_lastname_firstname_format(current_artist):
new_artist = convert_to_firstname_lastname(current_artist)
files_to_update.append({
'file_path': str(file_path),
'filename': file_path.name,
'old_artist': current_artist,
'new_artist': new_artist
})
except Exception as e:
print(f"⚠️ Could not read ID3 tags from {file_path.name}: {e}")
return files_to_update
def update_tracking_file(tracking_file: str, channel_name: str = "Let's Sing Karaoke", apply_changes: bool = False) -> Tuple[int, List[Dict]]:
"""
Update the karaoke tracking file to fix artist name formatting.
Args:
tracking_file: Path to the tracking JSON file
channel_name: Channel name to target (default: Let's Sing Karaoke)
apply_changes: Whether to actually apply changes or just preview
Returns:
Tuple of (number of changes made, list of changed entries)
"""
if not os.path.exists(tracking_file):
print(f"❌ Tracking file not found: {tracking_file}")
return 0, []
# Load the tracking data
with open(tracking_file, 'r', encoding='utf-8') as f:
data = json.load(f)
changes_made = 0
changed_entries = []
# Process songs
for song_key, song_data in data.get('songs', {}).items():
if song_data.get('channel_name') != channel_name:
continue
artist = song_data.get('artist', '')
if not artist or not is_lastname_firstname_format(artist):
continue
# Convert the artist name
new_artist = convert_to_firstname_lastname(artist)
if apply_changes:
# Update the tracking data
song_data['artist'] = new_artist
# Update the video title if it exists and contains the old artist name
video_title = song_data.get('video_title', '')
if video_title and artist in video_title:
song_data['video_title'] = video_title.replace(artist, new_artist)
# Update the file path if it exists
file_path = song_data.get('file_path', '')
if file_path and artist in file_path:
song_data['file_path'] = file_path.replace(artist, new_artist)
changes_made += 1
changed_entries.append({
'song_key': song_key,
'old_artist': artist,
'new_artist': new_artist,
'title': song_data.get('title', ''),
'file_path': song_data.get('file_path', '')
})
print(f"🔄 {'Updated' if apply_changes else 'Would update'}: '{artist}' → '{new_artist}' ({song_data.get('title', '')})")
# Save the updated data
if apply_changes and changes_made > 0:
# Create backup
backup_file = f"{tracking_file}.backup"
shutil.copy2(tracking_file, backup_file)
print(f"💾 Created backup: {backup_file}")
# Save updated file
with open(tracking_file, 'w', encoding='utf-8') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
print(f"💾 Updated tracking file: {tracking_file}")
return changes_made, changed_entries
def update_songlist_tracking(songlist_file: str, channel_name: str = "Let's Sing Karaoke", apply_changes: bool = False) -> Tuple[int, List[Dict]]:
"""
Update the songlist tracking file to fix artist name formatting.
Args:
songlist_file: Path to the songlist tracking JSON file
channel_name: Channel name to target (default: Let's Sing Karaoke)
apply_changes: Whether to actually apply changes or just preview
Returns:
Tuple of (number of changes made, list of changed entries)
"""
if not os.path.exists(songlist_file):
print(f"❌ Songlist tracking file not found: {songlist_file}")
return 0, []
# Load the songlist data
with open(songlist_file, 'r', encoding='utf-8') as f:
data = json.load(f)
changes_made = 0
changed_entries = []
# Process songlist entries
for song_key, song_data in data.items():
artist = song_data.get('artist', '')
if not artist or not is_lastname_firstname_format(artist):
continue
# Convert the artist name
new_artist = convert_to_firstname_lastname(artist)
if apply_changes:
# Update the songlist data
song_data['artist'] = new_artist
changes_made += 1
changed_entries.append({
'song_key': song_key,
'old_artist': artist,
'new_artist': new_artist,
'title': song_data.get('title', '')
})
print(f"🔄 {'Updated' if apply_changes else 'Would update'} songlist: '{artist}' → '{new_artist}' ({song_data.get('title', '')})")
# Save the updated data
if apply_changes and changes_made > 0:
# Create backup
backup_file = f"{songlist_file}.backup"
shutil.copy2(songlist_file, backup_file)
print(f"💾 Created backup: {backup_file}")
# Save updated file
with open(songlist_file, 'w', encoding='utf-8') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
print(f"💾 Updated songlist file: {songlist_file}")
return changes_made, changed_entries
def update_id3_tags_for_files(files_to_update: List[Dict], apply_changes: bool = False) -> int:
"""
Update ID3 tags for a list of files.
Args:
files_to_update: List of files to update
apply_changes: Whether to actually apply changes or just preview
Returns:
Number of files successfully updated
"""
updated_count = 0
for file_info in files_to_update:
file_path = file_info['file_path']
new_artist = file_info['new_artist']
if update_id3_tags(file_path, new_artist, apply_changes):
updated_count += 1
return updated_count
def main():
"""Main function to run the artist name fix script."""
parser = argparse.ArgumentParser(description="Fix artist name formatting in ID3 tags for Let's Sing Karaoke")
parser.add_argument('--preview', action='store_true', help='Show what would be changed without making changes')
parser.add_argument('--apply', action='store_true', help='Actually apply the changes')
parser.add_argument('--external', type=str, help='Path to external karaoke directory')
args = parser.parse_args()
# Default to preview mode if no action specified
if not args.preview and not args.apply:
args.preview = True
print("🎤 Artist Name Format Fix Script (ID3 Tags Only)")
print("=" * 60)
print("This script will fix 'Last Name, First Name' format to 'First Name Last Name'")
print("Only targeting Let's Sing Karaoke channel to avoid affecting other channels.")
print("Focusing on ID3 tags only - filenames will not be changed.")
print()
if not MUTAGEN_AVAILABLE:
print("❌ mutagen library not available!")
print("Please install it with: pip install mutagen")
return
if args.preview:
print("🔍 PREVIEW MODE - No changes will be made")
else:
print("⚡ APPLY MODE - Changes will be made")
print()
# File paths
tracking_file = "data/karaoke_tracking.json"
songlist_file = "data/songlist_tracking.json"
# Process external directory if specified
if args.external:
print(f"📁 Scanning external directory: {args.external}")
external_files = scan_external_directory(args.external)
if external_files:
print(f"\n📋 Found {len(external_files)} files with 'Last Name, First Name' format in ID3 tags:")
for file_info in external_files:
print(f" • {file_info['filename']}: '{file_info['old_artist']}' → '{file_info['new_artist']}'")
if args.apply:
print(f"\n📝 Updating ID3 tags in external files...")
updated_count = update_id3_tags_for_files(external_files, apply_changes=True)
print(f"✅ Updated ID3 tags in {updated_count} external files")
else:
print(f"\n📝 Would update ID3 tags in {len(external_files)} external files")
else:
print("✅ No files with 'Last Name, First Name' format found in ID3 tags")
# Process tracking files (only if they exist in current project)
if os.path.exists(tracking_file):
print(f"\n📊 Processing karaoke tracking file...")
tracking_changes, tracking_entries = update_tracking_file(tracking_file, apply_changes=args.apply)
else:
print(f"\n⚠️ Tracking file not found: {tracking_file}")
tracking_changes = 0
if os.path.exists(songlist_file):
print(f"\n📊 Processing songlist tracking file...")
songlist_changes, songlist_entries = update_songlist_tracking(songlist_file, apply_changes=args.apply)
else:
print(f"\n⚠️ Songlist tracking file not found: {songlist_file}")
songlist_changes = 0
# Process local downloads directory ID3 tags
downloads_dir = "downloads"
local_id3_updates = 0
if os.path.exists(downloads_dir) and tracking_changes > 0:
print(f"\n📝 Processing ID3 tags in local downloads directory...")
# Scan local downloads for files that need ID3 tag updates
local_files = []
for entry in tracking_entries:
file_path = entry.get('file_path', '')
if file_path and os.path.exists(file_path.replace('\\', '/')):
local_files.append({
'file_path': file_path.replace('\\', '/'),
'filename': os.path.basename(file_path),
'old_artist': entry['old_artist'],
'new_artist': entry['new_artist']
})
if local_files:
local_id3_updates = update_id3_tags_for_files(local_files, apply_changes=args.apply)
total_changes = tracking_changes + songlist_changes
print("\n" + "=" * 60)
print("📋 Summary:")
print(f" • Tracking file changes: {tracking_changes}")
print(f" • Songlist file changes: {songlist_changes}")
print(f" • Local ID3 tag updates: {local_id3_updates}")
print(f" • Total changes: {total_changes}")
if args.external:
external_count = len(external_files)  # already scanned above; no need to re-scan
print(f" • External ID3 tag updates: {external_count}")
if total_changes > 0 or (args.external and external_count > 0):
if args.apply:
print("\n✅ Artist name formatting in ID3 tags has been fixed!")
print("💾 Backups have been created for all modified files.")
print("🔄 You may need to re-run your karaoke downloader to update any cached data.")
else:
print("\n🔍 Preview complete. Use --apply to make these changes.")
else:
print("\n✅ No changes needed! All artist names are already in the correct format.")
if __name__ == "__main__":
main()
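The matching rule described in the script header — one comma, exactly two words after it, at most four words overall — can be checked in isolation. The helper name and the sample names below are invented:

```python
def looks_like_lastname_firstname(name: str) -> bool:
    """One comma, exactly two words after it, at most four words overall."""
    if ',' not in name:
        return False
    _, after = name.split(',', 1)
    return len(after.strip().split()) == 2 and len(name.split()) <= 4

print(looks_like_lastname_firstname("Presley, Elvis Aaron"))  # True
print(looks_like_lastname_firstname("Earth, Wind & Fire"))    # False
```

The word-count guards are what keep band names and multi-artist entries from being reversed by mistake.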


@ -0,0 +1,295 @@
#!/usr/bin/env python3
"""
Fix artist name formatting for Let's Sing Karaoke channel.
This script specifically targets the "Last Name, First Name" format and converts it to
"First Name Last Name" format in ID3 tags. It only processes entries where there is exactly one comma
followed by exactly 2 words, to avoid affecting multi-artist entries.
Usage:
python fix_artist_name_format_simple.py --preview # Show what would be changed
python fix_artist_name_format_simple.py --apply # Actually make the changes
python fix_artist_name_format_simple.py --external "D:\Karaoke\Karaoke\MP4\Let's Sing Karaoke" # Use external directory
"""
import json
import os
import re
import shutil
import argparse
from pathlib import Path
from typing import Dict, List, Tuple, Optional
# Try to import mutagen for ID3 tag manipulation
try:
from mutagen.mp4 import MP4
MUTAGEN_AVAILABLE = True
except ImportError:
MUTAGEN_AVAILABLE = False
print("WARNING: mutagen not available - install with: pip install mutagen")
def is_lastname_firstname_format(artist_name: str) -> bool:
"""
Check if artist name is in "Last Name, First Name" format.
Args:
artist_name: The artist name to check
Returns:
True if the name matches "Last Name, First Name" format with exactly 1 or 2 words after comma
"""
if ',' not in artist_name:
return False
# Split by comma
parts = artist_name.split(',', 1)
if len(parts) != 2:
return False
last_name = parts[0].strip()
first_name_part = parts[1].strip()
# Check if there are exactly 1 or 2 words after the comma
words_after_comma = first_name_part.split()
if len(words_after_comma) not in [1, 2]:
return False
# Additional check: make sure it's not a multi-artist entry
# If there are more than 4 words total in the artist name, it might be multi-artist
total_words = len(artist_name.split())
if total_words > 4: # Last, First Name (4 words max for single artist)
return False
return True
def convert_lastname_firstname(artist_name: str) -> str:
"""
Convert "Last Name, First Name" to "First Name Last Name".
Args:
artist_name: The artist name to convert
Returns:
The converted artist name
"""
if ',' not in artist_name:
return artist_name
parts = artist_name.split(',', 1)
if len(parts) != 2:
return artist_name
last_name = parts[0].strip()
first_name = parts[1].strip()
return f"{first_name} {last_name}"
def process_artist_name(artist_name: str) -> str:
"""
Process an artist name, handling both single artists and multiple artists separated by "&".
Args:
artist_name: The artist name to process
Returns:
The processed artist name
"""
if '&' in artist_name:
# Split by "&" and process each artist individually
artists = [artist.strip() for artist in artist_name.split('&')]
processed_artists = []
for artist in artists:
if is_lastname_firstname_format(artist):
processed_artist = convert_lastname_firstname(artist)
processed_artists.append(processed_artist)
else:
processed_artists.append(artist)
# Rejoin with "&"
return ' & '.join(processed_artists)
else:
# Single artist
if is_lastname_firstname_format(artist_name):
return convert_lastname_firstname(artist_name)
else:
return artist_name
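A minimal sketch of the `&` handling above, omitting the word-count guard for brevity (function names and sample artists are invented):

```python
def convert(name: str) -> str:
    """'Last, First' -> 'First Last'; names without a comma pass through."""
    if ',' not in name:
        return name
    last, first = [p.strip() for p in name.split(',', 1)]
    return f"{first} {last}"

def process(name: str) -> str:
    """Apply the conversion to each '&'-separated artist independently."""
    return ' & '.join(convert(part.strip()) for part in name.split('&'))

print(process("Smith, John & Doe, Jane"))  # John Smith & Jane Doe
print(process("The Beatles"))              # The Beatles
```

Splitting on `&` first means each collaborator is normalized on its own, then the list is rejoined in the original order.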
def update_id3_tags(file_path: str, new_artist: str, apply_changes: bool = False) -> bool:
"""
Update the ID3 tags in an MP4 file.
Args:
file_path: Path to the MP4 file
new_artist: New artist name to set
apply_changes: Whether to actually apply changes or just preview
Returns:
True if successful, False otherwise
"""
if not MUTAGEN_AVAILABLE:
print(f"WARNING: mutagen not available - cannot update ID3 tags for {file_path}")
return False
try:
mp4 = MP4(file_path)
if apply_changes:
# Update the artist tag
mp4["\xa9ART"] = new_artist
mp4.save()
print(f"UPDATED ID3 tag: {os.path.basename(file_path)} -> Artist: '{new_artist}'")
else:
# Just preview what would be changed
current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"
print(f"WOULD UPDATE ID3 tag: {os.path.basename(file_path)} -> Artist: '{current_artist}' -> '{new_artist}'")
return True
except Exception as e:
print(f"ERROR: Failed to update ID3 tags for {file_path}: {e}")
return False
def scan_external_directory(directory_path: str, debug: bool = False) -> List[Dict]:
"""
Scan external directory for MP4 files with "Last Name, First Name" format in ID3 tags.
Args:
directory_path: Path to the external directory
debug: Whether to show debug information
Returns:
List of files that need ID3 tag updates
"""
if not os.path.exists(directory_path):
print(f"ERROR: Directory not found: {directory_path}")
return []
if not MUTAGEN_AVAILABLE:
print("ERROR: mutagen not available - cannot scan ID3 tags")
return []
files_to_update = []
total_files = 0
files_with_artist_tags = 0
# Scan for MP4 files
for file_path in Path(directory_path).glob("*.mp4"):
total_files += 1
try:
mp4 = MP4(str(file_path))
current_artist = mp4.get("\xa9ART", ["Unknown"])[0] if "\xa9ART" in mp4 else "Unknown"
if current_artist != "Unknown":
files_with_artist_tags += 1
if debug:
print(f"DEBUG: {file_path.name} -> Artist: '{current_artist}'")
# Process the artist name to handle multiple artists
processed_artist = process_artist_name(current_artist)
if processed_artist != current_artist:
files_to_update.append({
'file_path': str(file_path),
'filename': file_path.name,
'old_artist': current_artist,
'new_artist': processed_artist
})
if debug:
print(f"DEBUG: MATCH FOUND - {file_path.name}: '{current_artist}' -> '{processed_artist}'")
except Exception as e:
if debug:
print(f"WARNING: Could not read ID3 tags from {file_path.name}: {e}")
print(f"INFO: Scanned {total_files} MP4 files, {files_with_artist_tags} had artist tags, {len(files_to_update)} need updates")
return files_to_update
def update_id3_tags_for_files(files_to_update: List[Dict], apply_changes: bool = False) -> int:
"""
Update ID3 tags for a list of files.
Args:
files_to_update: List of files to update
apply_changes: Whether to actually apply changes or just preview
Returns:
Number of files successfully updated
"""
updated_count = 0
for file_info in files_to_update:
file_path = file_info['file_path']
new_artist = file_info['new_artist']
if update_id3_tags(file_path, new_artist, apply_changes):
updated_count += 1
return updated_count
def main():
"""Main function to run the artist name fix script."""
    parser = argparse.ArgumentParser(description="Fix artist name formatting in ID3 tags for Let's Sing Karaoke")
    parser.add_argument('--preview', action='store_true', help='Show what would be changed without making changes')
    parser.add_argument('--apply', action='store_true', help='Actually apply the changes')
    parser.add_argument('--external', type=str, help='Path to external karaoke directory')
    parser.add_argument('--debug', action='store_true', help='Show debug information')
    args = parser.parse_args()
    # Default to preview mode if no action specified
    if not args.preview and not args.apply:
        args.preview = True
    print("Artist Name Format Fix Script (ID3 Tags Only)")
    print("=" * 60)
    print("This script will fix 'Last Name, First Name' format to 'First Name Last Name'")
    print("Only targeting Let's Sing Karaoke channel to avoid affecting other channels.")
    print("Focusing on ID3 tags only - filenames will not be changed.")
    print()
    if not MUTAGEN_AVAILABLE:
        print("ERROR: mutagen library not available!")
        print("Please install it with: pip install mutagen")
        return
    if args.preview:
        print("PREVIEW MODE - No changes will be made")
    else:
        print("APPLY MODE - Changes will be made")
    print()
    # Process external directory if specified
    if args.external:
        print(f"Scanning external directory: {args.external}")
        external_files = scan_external_directory(args.external, debug=args.debug)
        if external_files:
            print(f"\nFound {len(external_files)} files with 'Last Name, First Name' format in ID3 tags:")
            for file_info in external_files:
                print(f"  * {file_info['filename']}: '{file_info['old_artist']}' -> '{file_info['new_artist']}'")
            if args.apply:
                print("\nUpdating ID3 tags in external files...")
                updated_count = update_id3_tags_for_files(external_files, apply_changes=True)
                print(f"SUCCESS: Updated ID3 tags in {updated_count} external files")
            else:
                print(f"\nWould update ID3 tags in {len(external_files)} external files")
        else:
            print("SUCCESS: No files with 'Last Name, First Name' format found in ID3 tags")
    print("\n" + "=" * 60)
    print("Summary complete.")

if __name__ == "__main__":
    main()
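The "Last Name, First Name" → "First Name Last Name" rewrite this script applies to ID3 artist tags can be sketched as a small standalone helper. This is a hypothetical illustration, not the script's actual implementation; names containing more or fewer than one comma are deliberately left untouched so band names and multi-part credits are not mangled:

```python
def swap_artist_name(artist: str) -> str:
    """Convert 'Last, First' to 'First Last'; leave any other format untouched."""
    parts = [p.strip() for p in artist.split(",")]
    # Only swap when there are exactly two non-empty comma-separated parts.
    if len(parts) == 2 and all(parts):
        return f"{parts[1]} {parts[0]}"
    return artist

print(swap_artist_name("Sinatra, Frank"))  # Frank Sinatra
print(swap_artist_name("The Beatles"))     # The Beatles
```

In preview mode the script would only report `old_artist` → `new_artist` pairs produced by a transform like this; only `--apply` writes them back to the tags.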

View File

@@ -0,0 +1,151 @@
#!/usr/bin/env python3
"""
Script to reset karaoke tracking and re-download files with the new channel parser.
This script will:
1. Reset the karaoke_tracking.json to remove all downloaded entries
2. Optionally delete the downloaded files
3. Allow you to re-download with the new channel parser system
"""
import json
import os
import shutil
from pathlib import Path
from typing import List, Dict, Any
from karaoke_downloader.data_path_manager import get_data_path_manager
def reset_karaoke_tracking(tracking_file: str = None) -> None:
    """Reset the karaoke tracking file to an empty state."""
    if tracking_file is None:
        tracking_file = str(get_data_path_manager().get_karaoke_tracking_path())
    print(f"Resetting {tracking_file}...")
    # Create backup of current tracking
    backup_file = f"{tracking_file}.backup"
    if os.path.exists(tracking_file):
        shutil.copy2(tracking_file, backup_file)
        print(f"Created backup: {backup_file}")
    # Reset to empty state
    empty_tracking = {
        "playlists": {},
        "songs": {}
    }
    with open(tracking_file, 'w', encoding='utf-8') as f:
        json.dump(empty_tracking, f, indent=2, ensure_ascii=False)
    print(f"✅ Reset {tracking_file} to empty state")
def delete_downloaded_files(downloads_dir: str = "downloads") -> None:
    """Delete all downloaded files and folders."""
    if not os.path.exists(downloads_dir):
        print(f"Downloads directory {downloads_dir} does not exist.")
        return
    print(f"Deleting all files in {downloads_dir}...")
    try:
        shutil.rmtree(downloads_dir)
        print(f"✅ Deleted {downloads_dir} directory")
    except Exception as e:
        print(f"❌ Error deleting {downloads_dir}: {e}")
def show_download_stats(tracking_file: str = None) -> None:
    """Show statistics about current downloads."""
    if tracking_file is None:
        tracking_file = str(get_data_path_manager().get_karaoke_tracking_path())
    if not os.path.exists(tracking_file):
        print("No tracking file found.")
        return
    with open(tracking_file, 'r', encoding='utf-8') as f:
        tracking = json.load(f)
    songs = tracking.get("songs", {})
    total_songs = len(songs)
    if total_songs == 0:
        print("No songs in tracking file.")
        return
    # Count songs by status and by channel
    status_counts = {}
    channel_counts = {}
    for song_id, song_data in songs.items():
        status = song_data.get("status", "UNKNOWN")
        channel = song_data.get("channel_name", "UNKNOWN")
        status_counts[status] = status_counts.get(status, 0) + 1
        channel_counts[channel] = channel_counts.get(channel, 0) + 1
    print("\n📊 Current Download Statistics:")
    print(f"Total songs: {total_songs}")
    print("\nBy Status:")
    for status, count in status_counts.items():
        print(f"  {status}: {count}")
    print("\nBy Channel:")
    for channel, count in channel_counts.items():
        print(f"  {channel}: {count}")
def main():
    """Main function to handle the reset and re-download process."""
    print("🔄 Karaoke Download Reset and Re-download Tool")
    print("=" * 50)
    # Show current stats
    print("\nCurrent download statistics:")
    show_download_stats()
    # Ask the user what they want to do
    print("\nOptions:")
    print("1. Reset tracking only (keep files)")
    print("2. Reset tracking and delete all downloaded files")
    print("3. Show current stats only")
    print("4. Exit")
    choice = input("\nEnter your choice (1-4): ").strip()
    if choice == "1":
        print("\n🔄 Resetting tracking only...")
        reset_karaoke_tracking()
        print("\n✅ Tracking reset complete!")
        print("You can now re-download files with the new channel parser system.")
        print("\nTo re-download, run:")
        print("python download_karaoke.py --file data/channels.txt --limit 50")
    elif choice == "2":
        print("\n🔄 Resetting tracking and deleting files...")
        confirm = input("Are you sure you want to delete ALL downloaded files? (yes/no): ").strip().lower()
        if confirm == "yes":
            reset_karaoke_tracking()
            delete_downloaded_files()
            print("\n✅ Reset complete! All tracking and files have been removed.")
            print("You can now re-download files with the new channel parser system.")
            print("\nTo re-download, run:")
            print("python download_karaoke.py --file data/channels.txt --limit 50")
        else:
            print("Operation cancelled.")
    elif choice == "3":
        print("\n📊 Current statistics:")
        show_download_stats()
    elif choice == "4":
        print("Exiting...")
    else:
        print("Invalid choice. Please enter 1, 2, 3, or 4.")

if __name__ == "__main__":
    main()
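`show_download_stats` builds its counts with manual dictionaries; the same aggregation can be expressed more compactly with `collections.Counter`. A sketch against the same `"songs"` / `"status"` / `"channel_name"` schema the script reads (the sample payload below is invented for illustration):

```python
import json
from collections import Counter

def summarize_tracking(tracking_json: str):
    """Count songs by status and by channel from a karaoke_tracking.json payload."""
    songs = json.loads(tracking_json).get("songs", {})
    by_status = Counter(s.get("status", "UNKNOWN") for s in songs.values())
    by_channel = Counter(s.get("channel_name", "UNKNOWN") for s in songs.values())
    return by_status, by_channel

sample = ('{"playlists": {}, "songs": {'
          '"a1": {"status": "DOWNLOADED", "channel_name": "Sing King"}, '
          '"b2": {"status": "FAILED", "channel_name": "Sing King"}}}')
statuses, channels = summarize_tracking(sample)
print(statuses["DOWNLOADED"], channels["Sing King"])  # 1 2
```

`Counter` also gives missing keys a count of 0 and offers `most_common()` for free, which is handy when reporting the noisiest channels first.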

View File

@@ -1,11 +1,15 @@
 import json
 from pathlib import Path
+from karaoke_downloader.data_path_manager import get_data_path_manager

 def normalize_title(title):
     normalized = title.replace("(Karaoke Version)", "").replace("(Karaoke)", "").strip()
     return " ".join(normalized.split()).lower()

-def load_songlist(songlist_path="data/songList.json"):
+def load_songlist(songlist_path=None):
+    if songlist_path is None:
+        songlist_path = str(get_data_path_manager().get_songlist_path())
     songlist_file = Path(songlist_path)
     if not songlist_file.exists():
         print(f"⚠️ Songlist file not found: {songlist_path}")
@@ -24,14 +28,18 @@ def load_songlist(songlist_path="data/songList.json"):
         })
     return all_songs

-def load_songlist_tracking(tracking_path="data/songlist_tracking.json"):
+def load_songlist_tracking(tracking_path=None):
+    if tracking_path is None:
+        tracking_path = str(get_data_path_manager().get_songlist_tracking_path())
     tracking_file = Path(tracking_path)
     if not tracking_file.exists():
         return {}
     with open(tracking_file, 'r', encoding='utf-8') as f:
         return json.load(f)

-def load_server_songs(songs_path="data/songs.json"):
+def load_server_songs(songs_path=None):
+    if songs_path is None:
+        songs_path = str(get_data_path_manager().get_songs_path())
     """Load the list of songs already available on the server."""
     songs_file = Path(songs_path)
     if not songs_file.exists():
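The change in this diff swaps hardcoded path defaults for a `None` sentinel that is resolved at call time, so the data path manager is consulted on each call instead of freezing a path into the function signature at import. A standalone sketch of the pattern; `_default_songs_path` here is a hypothetical stand-in for `get_data_path_manager().get_songs_path()`:

```python
import json
from pathlib import Path

def _default_songs_path() -> Path:
    # Hypothetical stand-in for get_data_path_manager().get_songs_path()
    return Path("data") / "songs.json"

def load_server_songs(songs_path=None):
    """Load the songs file, resolving the default path only at call time."""
    if songs_path is None:
        songs_path = str(_default_songs_path())
    songs_file = Path(songs_path)
    if not songs_file.exists():
        return []
    with open(songs_file, 'r', encoding='utf-8') as f:
        return json.load(f)

print(load_server_songs("/nonexistent/songs.json"))  # []
```

Resolving inside the function also means tests can patch the resolver, which a default evaluated at definition time would not allow.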