Signed-off-by: mbrucedogs <mbrucedogs@gmail.com>
This commit is contained in:
parent
409e66780c
commit
9f0787d00a
97
PRD.md
97
PRD.md
@ -415,6 +415,103 @@ The codebase has been comprehensively refactored to improve maintainability and
|
||||
- **Flexible control**: Use with limits, parallel processing, and channel targeting
|
||||
- **Clear progress feedback**: Real-time progress tracking for large downloads
|
||||
|
||||
## 🔧 Recent Bug Fixes & Improvements (v3.4.5)
|
||||
### **Unified Download Workflow Architecture**
|
||||
- **Unified execution pipeline**: All download modes now use the same execution workflow, eliminating inconsistencies and broken pipelines
|
||||
- **Consistent behavior**: All modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel) use identical download execution, progress tracking, and error handling
|
||||
- **Centralized download logic**: Single `execute_unified_download_workflow()` method handles all download execution
|
||||
- **Automatic parallel support**: All download modes automatically support `--parallel --workers N` without additional implementation
|
||||
- **Unified cache management**: Consistent progress tracking and resume functionality across all modes
|
||||
|
||||
### **Architecture Pattern for New Download Modes**
|
||||
When adding new download modes in the future, follow this pattern to ensure consistency:
|
||||
|
||||
#### **1. Download Plan Building (Mode-Specific)**
|
||||
Each download mode should build a download plan (list of videos to download) with this structure:
|
||||
```python
|
||||
download_plan = [
|
||||
{
|
||||
"video_id": "video_id",
|
||||
"artist": "artist_name",
|
||||
"title": "song_title",
|
||||
"filename": "sanitized_filename.mp4",
|
||||
"channel_name": "channel_name",
|
||||
"video_title": "original_video_title",
|
||||
"force_download": False
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
#### **2. Unified Execution (Shared)**
|
||||
All modes should use the unified execution workflow:
|
||||
```python
|
||||
downloaded_count, success = self.execute_unified_download_workflow(
|
||||
download_plan=download_plan,
|
||||
cache_file=cache_file, # Optional, for progress tracking
|
||||
limit=limit, # Optional, for limiting downloads
|
||||
show_progress=True, # Optional, for progress display
|
||||
)
|
||||
```
|
||||
|
||||
#### **3. Execution Method Selection (Automatic)**
|
||||
The unified workflow automatically chooses execution method based on settings:
|
||||
- **Sequential**: Uses `DownloadPipeline` for single-threaded downloads
|
||||
- **Parallel**: Uses `ParallelDownloader` when `--parallel` is enabled
|
||||
|
||||
#### **4. Required Implementation Pattern**
|
||||
```python
|
||||
def download_new_mode(self, ...):
|
||||
"""New download mode implementation."""
|
||||
|
||||
# 1. Build download plan (mode-specific logic)
|
||||
download_plan = []
|
||||
for video in videos_to_download:
|
||||
download_plan.append({
|
||||
"video_id": video["id"],
|
||||
"artist": artist,
|
||||
"title": title,
|
||||
"filename": filename,
|
||||
"channel_name": channel_name,
|
||||
"video_title": video["title"],
|
||||
"force_download": force_download
|
||||
})
|
||||
|
||||
# 2. Create cache file (optional, for progress tracking)
|
||||
cache_file = get_download_plan_cache_file("new_mode", **plan_kwargs)
|
||||
save_plan_cache(cache_file, download_plan, [])
|
||||
|
||||
# 3. Use unified execution workflow
|
||||
downloaded_count, success = self.execute_unified_download_workflow(
|
||||
download_plan=download_plan,
|
||||
cache_file=cache_file,
|
||||
limit=limit,
|
||||
show_progress=True,
|
||||
)
|
||||
|
||||
return success
|
||||
```
|
||||
|
||||
### **Benefits of Unified Architecture**
|
||||
- **Consistency**: All modes behave identically for execution, progress tracking, and error handling
|
||||
- **Maintainability**: Changes to download execution only need to be made in one place
|
||||
- **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
|
||||
- **Extensibility**: New modes automatically get all existing features (parallel downloads, progress tracking, etc.)
|
||||
- **Testing**: Easier to test since all modes use the same execution logic
|
||||
|
||||
### **What Was Fixed**
|
||||
- **Broken Pipeline**: Previously, different modes used different execution paths, leading to inconsistencies
|
||||
- **Missing Method**: Added missing `download_latest_per_channel()` method that was referenced in CLI but not implemented
|
||||
- **Code Duplication**: Eliminated duplicate download execution logic across different modes
|
||||
- **Inconsistent Behavior**: All modes now have identical progress tracking, error handling, and cache management
|
||||
|
||||
### **Future Development Guidelines**
|
||||
1. **NEVER implement custom download execution logic** in new download modes
|
||||
2. **ALWAYS use `execute_unified_download_workflow()`** for download execution
|
||||
3. **Focus on download plan building** - that's where mode-specific logic belongs
|
||||
4. **Use the standard download plan structure** for consistency
|
||||
5. **Implement cache file handling** for progress tracking and resume functionality
|
||||
6. **Test with both sequential and parallel modes** to ensure compatibility
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Future Enhancements
|
||||
|
||||
118
README.md
118
README.md
@ -49,47 +49,82 @@ The codebase has been comprehensively refactored into a modular architecture wit
|
||||
- **`tracking_cli.py`**: Tracking management CLI
|
||||
|
||||
### New Utility Modules (v3.3):
|
||||
- **`parallel_downloader.py`**: Parallel download management with thread-safe operations
|
||||
- `ParallelDownloader` class: Manages concurrent downloads with configurable workers
|
||||
- `DownloadTask` and `DownloadResult` dataclasses: Structured task and result management
|
||||
- Thread-safe progress tracking and error handling
|
||||
- Automatic retry mechanism for failed downloads
|
||||
- **`file_utils.py`**: Centralized file operations, filename sanitization, and file validation
|
||||
- `sanitize_filename()`: Create safe filenames from artist/title
|
||||
- `generate_possible_filenames()`: Generate filename patterns for different modes
|
||||
- `check_file_exists_with_patterns()`: Check for existing files using multiple patterns
|
||||
- `is_valid_mp4_file()`: Validate MP4 files with header checking
|
||||
- `cleanup_temp_files()`: Remove temporary yt-dlp files
|
||||
- `ensure_directory_exists()`: Safe directory creation
|
||||
- **`song_validator.py`**: Centralized song validation logic for checking if songs should be downloaded
|
||||
|
||||
- **`song_validator.py`**: Centralized song validation logic
|
||||
- `SongValidator` class: Unified logic for checking if songs should be downloaded
|
||||
- `should_skip_song()`: Comprehensive validation with multiple criteria
|
||||
- `mark_song_failed()`: Consistent failure tracking
|
||||
- `handle_download_failure()`: Standardized error handling
|
||||
### **Unified Download Workflow (v3.4.5)**
|
||||
- **`execute_unified_download_workflow()`**: Centralized download execution that all modes use
|
||||
- **`_execute_sequential_downloads()`**: Sequential download execution using DownloadPipeline
|
||||
- **`_execute_parallel_downloads()`**: Parallel download execution using ParallelDownloader
|
||||
|
||||
- **Enhanced `config_manager.py`**: Robust configuration management with dataclasses
|
||||
- `ConfigManager` class: Type-safe configuration loading and caching
|
||||
- `DownloadSettings`, `FolderStructure`, `LoggingConfig` dataclasses
|
||||
- Configuration validation and merging with defaults
|
||||
- Dynamic resolution updates
|
||||
|
||||
### Benefits:
|
||||
### **Benefits of Enhanced Modular Architecture:**
|
||||
- **Single Responsibility**: Each module has a focused purpose
|
||||
- **Centralized Utilities**: Common operations (file operations, song validation, yt-dlp commands, error handling) are centralized
|
||||
- **Reduced Duplication**: Eliminated ~150 lines of code duplication across modules
|
||||
- **Testability**: Individual components can be tested separately
|
||||
- **Maintainability**: Easier to find and fix issues
|
||||
- **Reusability**: Components can be used independently
|
||||
- **Robustness**: Better error handling and interruption recovery
|
||||
- **Consistency**: Standardized error messages and processing pipelines
|
||||
- **Maintainability**: Changes isolated to specific modules
|
||||
- **Testability**: Modular components can be tested independently
|
||||
- **Type Safety**: Comprehensive type hints across all new modules
|
||||
- **Unified Execution**: All download modes use the same execution pipeline for consistency
|
||||
|
||||
## 🔧 Development Guidelines
|
||||
|
||||
### **Adding New Download Modes**
|
||||
When adding new download modes, follow the unified workflow pattern to ensure consistency:
|
||||
|
||||
#### **1. Build Download Plan (Mode-Specific)**
|
||||
```python
|
||||
def download_new_mode(self, ...):
|
||||
# Build download plan with standard structure
|
||||
download_plan = []
|
||||
for video in videos_to_download:
|
||||
download_plan.append({
|
||||
"video_id": video["id"],
|
||||
"artist": artist,
|
||||
"title": title,
|
||||
"filename": filename,
|
||||
"channel_name": channel_name,
|
||||
"video_title": video["title"],
|
||||
"force_download": force_download
|
||||
})
|
||||
|
||||
# Use unified execution workflow
|
||||
downloaded_count, success = self.execute_unified_download_workflow(
|
||||
download_plan=download_plan,
|
||||
cache_file=cache_file,
|
||||
limit=limit,
|
||||
show_progress=True,
|
||||
)
|
||||
|
||||
return success
|
||||
```
|
||||
|
||||
#### **2. Key Principles**
|
||||
- **NEVER implement custom download execution logic** - always use `execute_unified_download_workflow()`
|
||||
- **Focus on download plan building** - that's where mode-specific logic belongs
|
||||
- **Use the standard download plan structure** for consistency
|
||||
- **Implement cache file handling** for progress tracking and resume functionality
|
||||
- **Test with both sequential and parallel modes** to ensure compatibility
|
||||
|
||||
#### **3. Benefits of Unified Architecture**
|
||||
- **Consistency**: All modes behave identically for execution, progress tracking, and error handling
|
||||
- **Automatic Features**: New modes automatically get parallel downloads, progress tracking, and cache management
|
||||
- **Maintainability**: Changes to download execution only need to be made in one place
|
||||
- **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
|
||||
|
||||
## 🔧 Recent Improvements (v3.4.1)
|
||||
### **Enhanced Fuzzy Matching**
|
||||
- **Improved video title parsing**: The `extract_artist_title` function now handles multiple title formats:
|
||||
- `"Title Karaoke | Artist Karaoke Version"` → Artist: "38 Special", Title: "Hold On Loosely"
|
||||
- `"Title Artist KARAOKE"` → Attempts to extract artist from complex titles
|
||||
- `"Artist - Title"` → Standard format (unchanged)
|
||||
- **Consolidated parsing logic**: All modules now use the same `extract_artist_title` function from `fuzzy_matcher.py`
|
||||
- **Better matching accuracy**: Reduced false negatives for songs with non-standard title formats
|
||||
- **Improved title parsing**: Enhanced `extract_artist_title` function to handle multiple video title formats
|
||||
- **Better matching accuracy**: Can now parse titles like "Hold On Loosely Karaoke | 38 Special Karaoke Version"
|
||||
- **Consistent parsing**: All modules now use the same parsing logic from `fuzzy_matcher.py`
|
||||
- **Reduced false negatives**: Songs that previously couldn't be matched due to title format differences now have a higher chance of being found
|
||||
|
||||
### **Fixed Import Conflicts**
|
||||
- **Resolved import conflicts**: Updated modules to use the enhanced `extract_artist_title` from `fuzzy_matcher.py`
|
||||
- **Consistent behavior**: All parts of the system use the same parsing logic
|
||||
- **Cleaner codebase**: Eliminated duplicate code and import conflicts
|
||||
|
||||
### **Fixed --limit Parameter**
|
||||
- **Correct limit application**: The `--limit` parameter now properly limits the scanning phase, not just downloads
|
||||
@ -101,6 +136,27 @@ The codebase has been comprehensively refactored into a modular architecture wit
|
||||
- **Fixed import conflicts**: Resolved inconsistencies between different parsing implementations
|
||||
- **Single source of truth**: All title parsing logic is now centralized in `fuzzy_matcher.py`
|
||||
|
||||
## 🔧 Recent Improvements (v3.4.5)
|
||||
### **Unified Download Workflow Architecture**
|
||||
- **Unified execution pipeline**: All download modes now use the same execution workflow, eliminating inconsistencies and broken pipelines
|
||||
- **Consistent behavior**: All modes (--channel-focus, --all-videos, --songlist-only, --latest-per-channel) use identical download execution, progress tracking, and error handling
|
||||
- **Centralized download logic**: Single `execute_unified_download_workflow()` method handles all download execution
|
||||
- **Automatic parallel support**: All download modes automatically support `--parallel --workers N` without additional implementation
|
||||
- **Unified cache management**: Consistent progress tracking and resume functionality across all modes
|
||||
|
||||
### **What Was Fixed**
|
||||
- **Broken Pipeline**: Previously, different modes used different execution paths, leading to inconsistencies
|
||||
- **Missing Method**: Added missing `download_latest_per_channel()` method that was referenced in CLI but not implemented
|
||||
- **Code Duplication**: Eliminated duplicate download execution logic across different modes
|
||||
- **Inconsistent Behavior**: All modes now have identical progress tracking, error handling, and cache management
|
||||
|
||||
### **Benefits**
|
||||
- ✅ **Consistency**: All modes behave identically for execution, progress tracking, and error handling
|
||||
- ✅ **Maintainability**: Changes to download execution only need to be made in one place
|
||||
- ✅ **Reliability**: Eliminates broken pipelines and inconsistent behavior between modes
|
||||
- ✅ **Extensibility**: New modes automatically get all existing features (parallel downloads, progress tracking, etc.)
|
||||
- ✅ **Testing**: Easier to test since all modes use the same execution logic
|
||||
|
||||
## 🛡️ Duplicate File Prevention & Filename Consistency (v3.4.2)
|
||||
### **Duplicate File Prevention**
|
||||
- **Enhanced file existence checking**: Now detects files with `(2)`, `(3)`, etc. suffixes that yt-dlp creates
|
||||
|
||||
@ -443,43 +443,16 @@ class KaraokeDownloader:
|
||||
print(f" 🎯 Limit applied: {limit} videos")
|
||||
print(f" 📁 Output directory: downloads/{channel_name}/")
|
||||
print(f" 💾 Download plan cached to: {cache_file.name}")
|
||||
print(f"\n🎬 Starting downloads...")
|
||||
|
||||
# Download videos using the download pipeline
|
||||
pipeline = DownloadPipeline(
|
||||
yt_dlp_path=str(self.yt_dlp_path),
|
||||
config=self.config,
|
||||
downloads_dir=self.downloads_dir,
|
||||
songlist_tracking=self.songlist_tracking,
|
||||
tracker=self.tracker,
|
||||
# Use unified download workflow
|
||||
downloaded_count, success = self.execute_unified_download_workflow(
|
||||
download_plan=download_plan,
|
||||
cache_file=cache_file,
|
||||
limit=limit,
|
||||
show_progress=True,
|
||||
)
|
||||
|
||||
success_count = 0
|
||||
total_to_download = len(download_plan)
|
||||
for i, plan_item in enumerate(download_plan[:], 1): # Use slice to create a copy for iteration
|
||||
print(f"⬇️ Downloading {i}/{total_to_download}: {plan_item['artist']} - {plan_item['title']}")
|
||||
|
||||
if pipeline.execute_pipeline(
|
||||
video_id=plan_item["video_id"],
|
||||
artist=plan_item["artist"],
|
||||
title=plan_item["title"],
|
||||
channel_name=plan_item["channel_name"],
|
||||
video_title=plan_item["video_title"],
|
||||
):
|
||||
success_count += 1
|
||||
# Remove completed item from cache
|
||||
download_plan.remove(plan_item) # Remove the current item
|
||||
if download_plan: # If there are still items left
|
||||
save_plan_cache(cache_file, download_plan, [])
|
||||
print(f"🗑️ Removed completed item from download plan. {len(download_plan)} items remaining.")
|
||||
else:
|
||||
# All downloads completed, delete the cache file
|
||||
from karaoke_downloader.cache_manager import delete_plan_cache
|
||||
delete_plan_cache(cache_file)
|
||||
print("🗑️ All downloads completed, deleted download plan cache.")
|
||||
|
||||
print(f"\n🎉 Download complete! {success_count}/{total_to_download} videos downloaded successfully")
|
||||
return success_count > 0
|
||||
return success
|
||||
|
||||
def download_songlist_across_channels(
|
||||
self,
|
||||
@ -718,14 +691,134 @@ class KaraokeDownloader:
|
||||
print(f" ...and {len(unmatched)-DEFAULT_DISPLAY_LIMIT} more.")
|
||||
|
||||
# --- Download phase ---
|
||||
downloaded_count, success = self.execute_download_plan_parallel(
|
||||
downloaded_count, success = self.execute_unified_download_workflow(
|
||||
download_plan=download_plan,
|
||||
unmatched=unmatched,
|
||||
cache_file=cache_file,
|
||||
limit=limit,
|
||||
)
|
||||
return success
|
||||
|
||||
def download_latest_per_channel(
|
||||
self,
|
||||
channel_urls,
|
||||
limit=None,
|
||||
force_refresh_download_plan=False,
|
||||
fuzzy_match=False,
|
||||
fuzzy_threshold=DEFAULT_FUZZY_THRESHOLD,
|
||||
force_download=False,
|
||||
):
|
||||
"""
|
||||
Download the latest N videos from each channel.
|
||||
|
||||
Args:
|
||||
channel_urls: List of channel URLs to process
|
||||
limit: Number of latest videos to download from each channel
|
||||
force_refresh_download_plan: Force refresh the download plan cache
|
||||
fuzzy_match: Whether to use fuzzy matching
|
||||
fuzzy_threshold: Threshold for fuzzy matching
|
||||
force_download: Force download regardless of existing files
|
||||
|
||||
Returns:
|
||||
bool: True if successful, False otherwise
|
||||
"""
|
||||
print(f"\n🎬 Downloading latest {limit} videos from {len(channel_urls)} channels")
|
||||
|
||||
# Build download plan for latest videos from each channel
|
||||
download_plan = []
|
||||
total_videos_found = 0
|
||||
|
||||
for i, channel_url in enumerate(channel_urls, 1):
|
||||
print(f"\n🚦 Processing channel {i}/{len(channel_urls)}: {channel_url}")
|
||||
|
||||
# Get channel info
|
||||
channel_name, channel_id = get_channel_info(channel_url)
|
||||
print(f" ✅ Channel: {channel_name}")
|
||||
|
||||
# Get videos from channel
|
||||
available_videos = self.tracker.get_channel_video_list(
|
||||
channel_url,
|
||||
str(self.yt_dlp_path),
|
||||
force_refresh=False
|
||||
)
|
||||
|
||||
if not available_videos:
|
||||
print(f" ⚠️ No videos found for {channel_name}")
|
||||
continue
|
||||
|
||||
print(f" 📊 Found {len(available_videos)} videos")
|
||||
|
||||
# Take the latest N videos (they're already sorted by date)
|
||||
latest_videos = available_videos[:limit] if limit else available_videos
|
||||
print(f" 🎯 Processing latest {len(latest_videos)} videos")
|
||||
|
||||
# Process each video
|
||||
for video in latest_videos:
|
||||
video_title = video["title"]
|
||||
video_id = video["id"]
|
||||
|
||||
# Extract artist and title
|
||||
artist, extracted_title = self.channel_parser.extract_artist_title(video_title, channel_name)
|
||||
if not artist and not extracted_title:
|
||||
# Fallback: use the full title
|
||||
artist = ""
|
||||
extracted_title = video_title
|
||||
|
||||
# Create filename
|
||||
filename = sanitize_filename(artist, extracted_title)
|
||||
|
||||
# Add to download plan
|
||||
download_plan.append({
|
||||
"video_id": video_id,
|
||||
"artist": artist,
|
||||
"title": extracted_title,
|
||||
"filename": filename,
|
||||
"channel_name": channel_name,
|
||||
"video_title": video_title,
|
||||
"force_download": force_download
|
||||
})
|
||||
|
||||
total_videos_found += 1
|
||||
|
||||
print(f"\n📋 Download plan created: {total_videos_found} videos from {len(channel_urls)} channels")
|
||||
|
||||
if not download_plan:
|
||||
print("❌ No videos to download")
|
||||
return False
|
||||
|
||||
# Create cache file for progress tracking
|
||||
import hashlib
|
||||
from karaoke_downloader.cache_manager import get_download_plan_cache_file, save_plan_cache
|
||||
|
||||
plan_kwargs = {
|
||||
"mode": "latest_per_channel",
|
||||
"channels": len(channel_urls),
|
||||
"limit_per_channel": limit,
|
||||
"force_download": force_download,
|
||||
}
|
||||
|
||||
# Add channel URLs hash to ensure same channels = same cache
|
||||
channels_hash = hashlib.md5(
|
||||
"|".join(sorted(channel_urls)).encode()
|
||||
).hexdigest()[:8]
|
||||
plan_kwargs["channels_hash"] = channels_hash
|
||||
|
||||
cache_file = get_download_plan_cache_file("latest_per_channel", **plan_kwargs)
|
||||
|
||||
# Save the plan to cache
|
||||
save_plan_cache(cache_file, download_plan, []) # No unmatched for latest-per-channel mode
|
||||
|
||||
print(f"💾 Download plan cached to: {cache_file.name}")
|
||||
|
||||
# Use unified download workflow
|
||||
downloaded_count, success = self.execute_unified_download_workflow(
|
||||
download_plan=download_plan,
|
||||
cache_file=cache_file,
|
||||
limit=None, # Limit already applied during plan building
|
||||
show_progress=True,
|
||||
)
|
||||
|
||||
return success
|
||||
|
||||
def _process_videos_for_download(self, available_videos, channel_name, force_refresh=False, fuzzy_match=False, fuzzy_threshold=DEFAULT_FUZZY_THRESHOLD, force_download=False):
|
||||
"""Process videos for download (used for both manual and regular channels)."""
|
||||
songlist = load_songlist(self.songlist_file_path)
|
||||
@ -841,49 +934,39 @@ class KaraokeDownloader:
|
||||
matches = matches[:limit]
|
||||
print(f"🎯 Limiting to {len(matches)} downloads")
|
||||
|
||||
# Download matched videos
|
||||
success_count = 0
|
||||
for i, match in enumerate(matches, 1):
|
||||
# Convert matches to a download plan
|
||||
download_plan = []
|
||||
for match in matches:
|
||||
video = match["video"]
|
||||
song = match["song"]
|
||||
artist = match["artist"]
|
||||
title = match["title"]
|
||||
video_id = video["id"]
|
||||
|
||||
print(f"\n⬇️ Downloading {i}/{len(matches)}: {artist} - {title}")
|
||||
print(f" 🎬 Video: {video['title']} ({channel_name})")
|
||||
if match["match_type"] == "fuzzy":
|
||||
print(f" 🎯 Match Score: {match['match_score']:.1f}%")
|
||||
|
||||
# Create filename
|
||||
filename = sanitize_filename(artist, title)
|
||||
output_path = self.downloads_dir / channel_name / filename
|
||||
|
||||
# Use the download pipeline
|
||||
pipeline = DownloadPipeline(
|
||||
yt_dlp_path=str(self.yt_dlp_path),
|
||||
config=self.config,
|
||||
downloads_dir=self.downloads_dir,
|
||||
songlist_tracking=self.songlist_tracking,
|
||||
tracker=self.tracker,
|
||||
)
|
||||
|
||||
success = pipeline.execute_pipeline(
|
||||
video_id=video_id,
|
||||
artist=artist,
|
||||
title=title,
|
||||
channel_name=channel_name,
|
||||
video_title=video["title"]
|
||||
)
|
||||
|
||||
if success:
|
||||
success_count += 1
|
||||
print(f"✅ Successfully downloaded: {artist} - {title}")
|
||||
else:
|
||||
print(f"❌ Failed to download: {artist} - {title}")
|
||||
# Add to download plan
|
||||
download_plan.append({
|
||||
"video_id": video_id,
|
||||
"artist": artist,
|
||||
"title": title,
|
||||
"filename": filename,
|
||||
"channel_name": channel_name,
|
||||
"video_title": video["title"],
|
||||
"force_download": force_download
|
||||
})
|
||||
|
||||
print(f"\n🎉 Download complete! {success_count}/{len(matches)} videos downloaded successfully")
|
||||
return success_count > 0
|
||||
# Use the unified download workflow
|
||||
downloaded_count, success = self.execute_unified_download_workflow(
|
||||
download_plan=download_plan,
|
||||
cache_file=None, # No specific cache file for this mode
|
||||
limit=limit,
|
||||
show_progress=True,
|
||||
)
|
||||
|
||||
return success
|
||||
|
||||
def _download_single_video(self, video, channel_name, filename, force_download=False):
|
||||
"""Download a single video using the download pipeline."""
|
||||
@ -923,6 +1006,138 @@ class KaraokeDownloader:
|
||||
|
||||
return success
|
||||
|
||||
def execute_unified_download_workflow(
|
||||
self,
|
||||
download_plan,
|
||||
cache_file=None,
|
||||
limit=None,
|
||||
show_progress=True,
|
||||
):
|
||||
"""
|
||||
Unified download workflow that all download modes use.
|
||||
|
||||
Args:
|
||||
download_plan: List of download items with video_id, artist, title, channel_name, video_title
|
||||
cache_file: Optional cache file for progress tracking
|
||||
limit: Optional limit on number of downloads
|
||||
show_progress: Whether to show progress information
|
||||
|
||||
Returns:
|
||||
tuple: (downloaded_count, success)
|
||||
"""
|
||||
if not download_plan:
|
||||
print("📋 No videos to download in plan")
|
||||
return 0, True
|
||||
|
||||
total_to_download = len(download_plan)
|
||||
if limit:
|
||||
total_to_download = min(limit, total_to_download)
|
||||
download_plan = download_plan[:limit]
|
||||
|
||||
if show_progress:
|
||||
print(f"\n🎬 Starting downloads: {total_to_download} videos")
|
||||
print(f" 📁 Output directory: downloads/")
|
||||
if cache_file:
|
||||
print(f" 💾 Progress tracking: {cache_file.name}")
|
||||
|
||||
# Choose execution method based on parallel settings
|
||||
if self.enable_parallel_downloads:
|
||||
return self._execute_parallel_downloads(download_plan, cache_file, show_progress)
|
||||
else:
|
||||
return self._execute_sequential_downloads(download_plan, cache_file, show_progress)
|
||||
|
||||
def _execute_sequential_downloads(self, download_plan, cache_file, show_progress):
|
||||
"""Execute downloads sequentially using the download pipeline."""
|
||||
success_count = 0
|
||||
total_to_download = len(download_plan)
|
||||
|
||||
# Create download pipeline
|
||||
pipeline = DownloadPipeline(
|
||||
yt_dlp_path=str(self.yt_dlp_path),
|
||||
config=self.config,
|
||||
downloads_dir=self.downloads_dir,
|
||||
songlist_tracking=self.songlist_tracking,
|
||||
tracker=self.tracker,
|
||||
)
|
||||
|
||||
for i, plan_item in enumerate(download_plan, 1):
|
||||
if show_progress:
|
||||
print(f"\n⬇️ Downloading {i}/{total_to_download}: {plan_item['artist']} - {plan_item['title']}")
|
||||
print(f" 🎬 Video: {plan_item['video_title']} ({plan_item['channel_name']})")
|
||||
|
||||
success = pipeline.execute_pipeline(
|
||||
video_id=plan_item["video_id"],
|
||||
artist=plan_item["artist"],
|
||||
title=plan_item["title"],
|
||||
channel_name=plan_item["channel_name"],
|
||||
video_title=plan_item["video_title"],
|
||||
)
|
||||
|
||||
if success:
|
||||
success_count += 1
|
||||
if show_progress:
|
||||
print(f"✅ Successfully downloaded: {plan_item['artist']} - {plan_item['title']}")
|
||||
else:
|
||||
if show_progress:
|
||||
print(f"❌ Failed to download: {plan_item['artist']} - {plan_item['title']}")
|
||||
|
||||
# Update cache if provided
|
||||
if cache_file:
|
||||
# Remove completed item from plan and update cache
|
||||
download_plan.remove(plan_item)
|
||||
from karaoke_downloader.cache_manager import save_plan_cache
|
||||
save_plan_cache(cache_file, download_plan, []) # No unmatched for unified workflow
|
||||
|
||||
if not download_plan: # All downloads completed
|
||||
from karaoke_downloader.cache_manager import delete_plan_cache
|
||||
delete_plan_cache(cache_file)
|
||||
if show_progress:
|
||||
print("🗑️ All downloads completed, deleted download plan cache.")
|
||||
|
||||
if show_progress:
|
||||
print(f"\n🎉 Download complete! {success_count}/{total_to_download} videos downloaded successfully")
|
||||
|
||||
return success_count, success_count > 0
|
||||
|
||||
def _execute_parallel_downloads(self, download_plan, cache_file, show_progress):
|
||||
"""Execute downloads in parallel using the parallel downloader."""
|
||||
from karaoke_downloader.parallel_downloader import create_parallel_downloader
|
||||
|
||||
# Create parallel downloader
|
||||
parallel_downloader = create_parallel_downloader(
|
||||
yt_dlp_path=str(self.yt_dlp_path),
|
||||
config=self.config,
|
||||
downloads_dir=self.downloads_dir,
|
||||
max_workers=self.parallel_workers,
|
||||
songlist_tracking=self.songlist_tracking,
|
||||
tracker=self.tracker,
|
||||
)
|
||||
|
||||
# Convert download plan to tasks
|
||||
tasks = []
|
||||
for item in download_plan:
|
||||
from karaoke_downloader.parallel_downloader import DownloadTask
|
||||
task = DownloadTask(
|
||||
video_id=item["video_id"],
|
||||
artist=item["artist"],
|
||||
title=item["title"],
|
||||
channel_name=item["channel_name"],
|
||||
video_title=item["video_title"],
|
||||
)
|
||||
tasks.append(task)
|
||||
|
||||
# Execute parallel downloads
|
||||
results = parallel_downloader.execute_downloads(tasks)
|
||||
|
||||
# Count successes
|
||||
success_count = sum(1 for result in results if result.success)
|
||||
total_to_download = len(tasks)
|
||||
|
||||
if show_progress:
|
||||
print(f"\n🎉 Parallel download complete! {success_count}/{total_to_download} videos downloaded successfully")
|
||||
|
||||
return success_count, success_count > 0
|
||||
|
||||
|
||||
def reset_songlist_all():
|
||||
"""Delete all files tracked in songlist_tracking.json, clear songlist_tracking.json, and remove songlist songs from karaoke_tracking.json."""
|
||||
|
||||
Loading…
Reference in New Issue
Block a user