Signed-off-by: mbrucedogs <mbrucedogs@gmail.com>

mbrucedogs 2025-07-25 12:34:01 -05:00
parent 08d7d259f3
commit aa6608f4a5
9 changed files with 16503 additions and 348 deletions


@ -61,6 +61,8 @@ The codebase has been comprehensively refactored into a modular architecture wit
## 🚀 Quick Start
> **💡 Pro Tip**: For a complete list of all available commands, see `commands.txt` - you can copy/paste any command directly into your terminal!
### Download a Channel
```bash
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
@ -71,6 +73,11 @@ python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
python download_karaoke.py --songlist-only --limit 5
```
### Focus on Specific Playlists by Title
```bash
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"
```
### Download with Fuzzy Matching
```bash
python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85
@ -126,13 +133,26 @@ python download_karaoke.py --clear-cache all
- Place your prioritized song list in `data/songList.json` (see example format below).
- The tool will match and prioritize these songs across all available channel videos.
- Use `--songlist-only` to download only these songs, or `--songlist-priority` to prioritize them in the queue.
- Use `--songlist-focus` to download only songs from specific playlists, selected by an exact match on the playlist `title` (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`). This flag also enables songlist-only mode.
- Download progress for the songlist is tracked globally in `data/songlist_tracking.json`.
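Because playlist titles contain spaces, each title must be quoted so the shell passes it as a single argument. The sketch below shows how `nargs='+'` collects several quoted titles into one list. The `--songlist-focus` definition mirrors the one in `cli.py`; the `--limit` definition (with `type=int`) is an assumption made for illustration.

```python
import argparse

# Minimal parser; only --songlist-focus is taken from cli.py.
parser = argparse.ArgumentParser()
parser.add_argument("--songlist-focus", nargs="+", metavar="PLAYLIST_TITLE")
parser.add_argument("--limit", type=int)  # assumed definition, for illustration

# Simulates: --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100" --limit 5
args = parser.parse_args([
    "--songlist-focus", "2025 - Apple Top 50", "2024 - Billboard Hot 100",
    "--limit", "5",
])

# Each quoted title arrives as one list element.
print(args.songlist_focus)
print(args.limit)
```

Without the quotes, the shell would split `2025 - Apple Top 50` into five separate arguments, and none of them would match a playlist title.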
#### Example `data/songList.json`
```json
[
{ "artist": "Taylor Swift", "title": "Cruel Summer" },
{ "artist": "Billie Eilish", "title": "Happier Than Ever" }
{
"title": "2025 - Apple Top 50",
"songs": [
{ "artist": "Kendrick Lamar & SZA", "title": "luther", "position": 1 },
{ "artist": "Kendrick Lamar", "title": "Not Like Us", "position": 2 }
]
},
{
"title": "2024 - Billboard Hot 100",
"songs": [
{ "artist": "Taylor Swift", "title": "Cruel Summer", "position": 1 },
{ "artist": "Billie Eilish", "title": "Happier Than Ever", "position": 2 }
]
}
]
```
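How this nested format is consumed can be sketched as follows. This is a minimal illustration that mirrors the focus-filtering and flattening step in `downloader.py`; the function name `select_songs` and the inline `PLAYLISTS` data are hypothetical and not part of the tool.

```python
def select_songs(playlists, focus_titles=None):
    """Flatten playlists into a de-duplicated song list.

    If focus_titles is given, only playlists whose title matches one of
    them exactly are kept (hypothetical helper, not the tool's API).
    """
    seen = set()
    songs = []
    for playlist in playlists:
        if focus_titles and playlist.get("title", "") not in focus_titles:
            continue
        for song in playlist.get("songs", []):
            if "artist" not in song or "title" not in song:
                continue  # skip malformed entries
            artist = song["artist"].strip()
            title = song["title"].strip()
            key = f"{artist.lower()}_{title.lower()}"
            if key in seen:
                continue  # same song already taken from an earlier playlist
            seen.add(key)
            songs.append({
                "artist": artist,
                "title": title,
                "position": song.get("position", 0),
            })
    return songs


PLAYLISTS = [
    {"title": "2025 - Apple Top 50", "songs": [
        {"artist": "Kendrick Lamar & SZA", "title": "luther", "position": 1},
        {"artist": "Kendrick Lamar", "title": "Not Like Us", "position": 2},
    ]},
    {"title": "2024 - Billboard Hot 100", "songs": [
        {"artist": "Taylor Swift", "title": "Cruel Summer", "position": 1},
        {"artist": "Billie Eilish", "title": "Happier Than Ever", "position": 2},
    ]},
]

focused = select_songs(PLAYLISTS, ["2025 - Apple Top 50"])
print([song["title"] for song in focused])
```

Passing no `focus_titles` flattens every playlist, which corresponds to the plain `--songlist-only` behavior.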
@ -145,6 +165,7 @@ python download_karaoke.py --clear-cache all
## 📂 Folder Structure
```
KaroakeVideoDownloader/
├── commands.txt # Complete CLI commands reference (copy/paste ready)
├── karaoke_downloader/ # All core Python code and utilities
│ ├── downloader.py # Main orchestrator and CLI interface
│ ├── cli.py # CLI entry point
@ -185,9 +206,14 @@ KaroakeVideoDownloader/
```
## 🚦 CLI Options
> **📋 Complete Command Reference**: See `commands.txt` for all available commands with examples - perfect for copy/paste!
### Key Options:
- `--file <data/channels.txt>`: Download from a list of channels (optional, defaults to data/channels.txt for songlist modes)
- `--songlist-priority`: Prioritize songlist songs in download queue
- `--songlist-only`: Download only songs from the songlist
- `--songlist-focus <PLAYLIST_TITLE1> <PLAYLIST_TITLE2>...`: Download only songs from the playlists whose titles match exactly (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`); implies `--songlist-only`
- `--songlist-status`: Show songlist download progress
- `--limit <N>`: Limit number of downloads (enables fast mode with early exit)
- `--resolution <720p|1080p|...>`: Override resolution
@ -201,6 +227,9 @@ KaroakeVideoDownloader/
- `--fuzzy-threshold <N>`: Fuzzy match threshold (0-100, default 85)
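
To give a feel for what the threshold means, here is a rough sketch of threshold-based matching on a 0-100 scale. The tool's actual similarity function is not shown in this document; the sketch assumes `difflib.SequenceMatcher` from the Python standard library as a stand-in, and the names `similarity` and `is_match` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0 and 100."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100


def is_match(song_key: str, video_key: str, threshold: float = 85) -> bool:
    """Accept the video when its score meets or exceeds the threshold."""
    return similarity(song_key, video_key) >= threshold


# Identical keys (ignoring case) always pass; unrelated keys do not.
print(is_match("taylor swift_cruel summer", "Taylor Swift_Cruel Summer"))
print(is_match("taylor swift_cruel summer", "xyz"))
```

A higher threshold accepts fewer, closer matches; a lower one finds more candidates at the cost of more false positives.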
## 📝 Example Usage
> **💡 For complete examples**: See `commands.txt` for all command variations with explanations!
```bash
# Fast mode with fuzzy matching (no need to specify --file)
python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85
@ -228,6 +257,20 @@ python download_karaoke.py --clear-server-duplicates
- All options are in `data/config.json` (format, resolution, metadata, etc.)
- You can edit this file or use CLI flags to override
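
The override behavior can be pictured with a small sketch. The real layout of `data/config.json` is not shown here, so the flat `resolution` and `format` keys and the helper name `load_effective_config` are assumptions for illustration only.

```python
import json
from pathlib import Path


def load_effective_config(config_path, overrides):
    """Load config from JSON, then apply any CLI overrides that were set.

    Hypothetical helper: assumes a flat key/value config file.
    """
    config = {}
    path = Path(config_path)
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    for key, value in overrides.items():
        if value is not None:  # None means the flag was not given
            config[key] = value
    return config


# A CLI flag such as --resolution 1080p would win over the file's value.
effective = load_effective_config(
    "data/config.json", {"resolution": "1080p", "format": None}
)
print(effective["resolution"])
```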
## 📋 Command Reference File
**`commands.txt`** contains a comprehensive list of all CLI commands with explanations. This file is designed for easy copy/paste usage and includes:
- All basic download commands
- Songlist operations
- Latest-per-channel downloads
- Cache and tracking management
- Reset and cleanup operations
- Advanced combinations
- Common workflows
- Troubleshooting commands
> **🔄 Maintenance Note**: The `commands.txt` file should be kept up to date with any CLI changes. When adding new command-line options or modifying existing ones, update this file to reflect all available commands and their usage.
## 🔧 Refactoring Improvements (v3.2)
The codebase has been comprehensively refactored to improve maintainability and reduce code duplication:

commands.txt (new file, 188 lines)

@ -0,0 +1,188 @@
# 🎤 Karaoke Video Downloader - CLI Commands Reference
# Copy and paste these commands into your terminal
# Updated: v3.2 (includes all refactoring improvements)
## 📥 BASIC DOWNLOADS
# Download a single channel
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
# Download from a file containing multiple channel URLs
python download_karaoke.py --file data/channels.txt
# Download with custom resolution (480p, 720p, 1080p, 1440p, 2160p)
python download_karaoke.py --resolution 1080p https://www.youtube.com/@SingKingKaraoke/videos
# Limit number of downloads (fast mode with early exit)
python download_karaoke.py --limit 10 https://www.youtube.com/@SingKingKaraoke/videos
## 🎵 SONGLIST OPERATIONS
# Download only songs from your songlist (uses data/channels.txt by default)
python download_karaoke.py --songlist-only
# Download only songlist songs with limit
python download_karaoke.py --songlist-only --limit 5
# Download songlist songs with fuzzy matching (more flexible matching)
python download_karaoke.py --songlist-only --fuzzy-match --limit 10
# Download songlist songs with custom fuzzy threshold (0-100, default 90)
python download_karaoke.py --songlist-only --fuzzy-match --fuzzy-threshold 85 --limit 10
# Focus on specific playlists by title (download only songs from these playlists)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"
# Focus on specific playlists with fuzzy matching
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --fuzzy-match --fuzzy-threshold 85
# Focus on specific playlists with limit
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --limit 5
# Prioritize songlist songs in download queue (default behavior)
python download_karaoke.py --songlist-priority https://www.youtube.com/@SingKingKaraoke/videos
# Disable songlist prioritization
python download_karaoke.py --no-songlist-priority https://www.youtube.com/@SingKingKaraoke/videos
# Show songlist download status and statistics
python download_karaoke.py --songlist-status
## 🗂️ LATEST-PER-CHANNEL DOWNLOADS
# Download latest 5 videos from each channel
python download_karaoke.py --latest-per-channel --limit 5
# Download latest videos with fuzzy matching
python download_karaoke.py --latest-per-channel --limit 5 --fuzzy-match --fuzzy-threshold 85
# Download latest videos from specific channels file
python download_karaoke.py --latest-per-channel --limit 5 --file data/channels.txt
## 🔄 CACHE & TRACKING MANAGEMENT
# Show download status and statistics
python download_karaoke.py --status
# Show channel cache information
python download_karaoke.py --cache-info
# Clear cache for a specific channel
python download_karaoke.py --clear-cache SingKingKaraoke
# Clear cache for all channels
python download_karaoke.py --clear-cache all
# Set cache duration (in hours)
python download_karaoke.py --cache-duration 48
# Force refresh channel cache (ignore cached data)
python download_karaoke.py --refresh https://www.youtube.com/@SingKingKaraoke/videos
# Force refresh download plan cache (re-scan all channels for matches)
python download_karaoke.py --force-download-plan --songlist-only
# Clear server duplicates tracking (allows re-checking songs against server)
python download_karaoke.py --clear-server-duplicates
## 🧹 RESET & CLEANUP OPERATIONS
# Reset all tracking and files for a specific channel
python download_karaoke.py --reset-channel SingKingKaraoke
# Reset channel and also reset songlist songs for this channel
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
# Reset all songlist tracking and delete all songlist-downloaded files (GLOBAL)
python download_karaoke.py --reset-songlist-all
# Clean up orphaned tracking entries
python download_karaoke.py --cleanup
## 📊 REPORTS & SYNC
# Generate detailed report for a specific playlist
python download_karaoke.py --report PLAYLIST_ID
# Only sync playlist without downloading (update tracking)
python download_karaoke.py --sync https://www.youtube.com/@SingKingKaraoke/videos
# Show version information
python download_karaoke.py --version
## 🎯 ADVANCED COMBINATIONS
# Fast songlist download with fuzzy matching and high quality
python download_karaoke.py --songlist-only --limit 20 --fuzzy-match --fuzzy-threshold 85 --resolution 1080p
# Latest videos per channel with fuzzy matching
python download_karaoke.py --latest-per-channel --limit 3 --fuzzy-match --fuzzy-threshold 90 --file data/channels.txt
# Force refresh everything and download songlist
python download_karaoke.py --songlist-only --force-download-plan --refresh --limit 10
# High-quality download with custom cache duration
python download_karaoke.py --resolution 1080p --cache-duration 72 --limit 5 https://www.youtube.com/@SingKingKaraoke/videos
## 📋 COMMON WORKFLOWS
# 1. Quick songlist download (most common)
python download_karaoke.py --songlist-only --limit 10
# 1b. Focus on specific playlists (fast targeted download)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --limit 5
# 2. Latest videos from all channels
python download_karaoke.py --latest-per-channel --limit 5
# 3. High-quality single channel download
python download_karaoke.py --resolution 1080p --limit 20 https://www.youtube.com/@SingKingKaraoke/videos
# 4. Fuzzy matching for better song discovery
python download_karaoke.py --songlist-only --fuzzy-match --fuzzy-threshold 80 --limit 15
# 4b. Focused fuzzy matching (target specific playlists with flexible matching)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --fuzzy-match --fuzzy-threshold 80 --limit 10
# 5. Reset and start fresh
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
# 6. Check status and clear cache if needed
python download_karaoke.py --status
python download_karaoke.py --clear-cache all
## 🔧 TROUBLESHOOTING COMMANDS
# Check if everything is working
python download_karaoke.py --version
# Force refresh everything
python download_karaoke.py --force-download-plan --refresh --clear-cache all
# Reset everything and start fresh
python download_karaoke.py --reset-songlist-all
python download_karaoke.py --clear-server-duplicates
## 📝 NOTES
# Default files used:
# - data/channels.txt (default channel list for songlist modes)
# - data/songList.json (your prioritized song list)
# - data/config.json (download settings)
# Resolution options: 480p, 720p (default), 1080p, 1440p, 2160p
# Fuzzy threshold: 0-100 (higher = more strict matching, default 90)
# The system automatically:
# - Uses data/channels.txt if no --file specified in songlist modes
# - Caches channel data for 24 hours (configurable)
# - Tracks all downloads in JSON files
# - Avoids re-downloading existing files
# - Checks for server duplicates
# For best performance:
# - Use --limit for faster downloads
# - Use --fuzzy-match for better song discovery
# - Use --refresh sparingly (forces re-scan)
# - Clear cache if you encounter issues

File diff suppressed because it is too large


@ -1,4 +1,259 @@
[
{
"title": "2025 - Apple Top 50",
"songs": [
{
"position": 1,
"title": "luther",
"artist": "Kendrick Lamar & SZA"
},
{
"position": 2,
"title": "Not Like Us",
"artist": "Kendrick Lamar"
},
{
"position": 3,
"title": "30 For 30",
"artist": "SZA"
},
{
"position": 4,
"title": "I'm The Problem",
"artist": "Morgan Wallen"
},
{
"position": 5,
"title": "NOKIA",
"artist": "Drake"
},
{
"position": 6,
"title": "DtMF",
"artist": "Bad Bunny"
},
{
"position": 7,
"title": "Burning Blue",
"artist": "Mariah the Scientist"
},
{
"position": 8,
"title": "What I Want",
"artist": "Morgan Wallen & Tate McRae"
},
{
"position": 9,
"title": "GIMME A HUG",
"artist": "Drake"
},
{
"position": 10,
"title": "EVIL J0RDAN",
"artist": "Playboi Carti"
},
{
"position": 11,
"title": "What Did I Miss",
"artist": "Drake"
},
{
"position": 12,
"title": "Dum, Dumb, and Dumber",
"artist": "Lil Baby, Young Thug & Future"
},
{
"position": 13,
"title": "DAISIES",
"artist": "Justin Bieber"
},
{
"position": 14,
"title": "ALL I CAN TAKE",
"artist": "Justin Bieber"
},
{
"position": 15,
"title": "BAILE INoLVIDABLE",
"artist": "Bad Bunny"
},
{
"position": 16,
"title": "Just In Case",
"artist": "Morgan Wallen"
},
{
"position": 17,
"title": "Blue Strips",
"artist": "Jessie Murph"
},
{
"position": 18,
"title": "All The Way",
"artist": "BigXthaPlug & Bailey Zimmerman"
},
{
"position": 19,
"title": "I Ain't Comin' Back",
"artist": "Morgan Wallen & Post Malone"
},
{
"position": 20,
"title": "Superman",
"artist": "Morgan Wallen"
},
{
"position": 21,
"title": "CN TOWER",
"artist": "PARTYNEXTDOOR & Drake"
},
{
"position": 22,
"title": "Outside",
"artist": "Cardi B"
},
{
"position": 23,
"title": "KICK OUT",
"artist": "Travis Scott"
},
{
"position": 24,
"title": "RATHER LIE",
"artist": "Playboi Carti"
},
{
"position": 25,
"title": "Listen Up",
"artist": "Lil Baby"
},
{
"position": 26,
"title": "Smile",
"artist": "Morgan Wallen"
},
{
"position": 27,
"title": "tv off",
"artist": "Kendrick Lamar"
},
{
"position": 28,
"title": "I Got Better",
"artist": "Morgan Wallen"
},
{
"position": 29,
"title": "Cry For Me",
"artist": "The Weeknd"
},
{
"position": 30,
"title": "NUEVAYoL",
"artist": "Bad Bunny"
},
{
"position": 31,
"title": "By Myself",
"artist": "Lil Baby & Rylo Rodriguez"
},
{
"position": 32,
"title": "DUMBO",
"artist": "Travis Scott"
},
{
"position": 33,
"title": "Crazy Train",
"artist": "Ozzy Osbourne"
},
{
"position": 34,
"title": "Courtesy of the Red, White and Blue",
"artist": "Toby Keith"
},
{
"position": 35,
"title": "I'm A Little Crazy",
"artist": "Morgan Wallen"
},
{
"position": 36,
"title": "20 Cigarettes",
"artist": "Morgan Wallen"
},
{
"position": 37,
"title": "VOY A LLeVARTE PA PR",
"artist": "Bad Bunny"
},
{
"position": 38,
"title": "SOMETHING ABOUT YOU",
"artist": "PARTYNEXTDOOR & Drake"
},
{
"position": 39,
"title": "RATHER LIE",
"artist": "Playboi Carti & The Weeknd"
},
{
"position": 40,
"title": "GO BABY",
"artist": "Justin Bieber"
},
{
"position": 41,
"title": "F U 2x",
"artist": "Lil Baby"
},
{
"position": 42,
"title": "Vanish Mode",
"artist": "Lil Durk"
},
{
"position": 43,
"title": "CHAMPAIN & VACAY",
"artist": "Travis Scott, Don Toliver & Waka Flocka Flame"
},
{
"position": 44,
"title": "Die With A Smile",
"artist": "Lady Gaga & Bruno Mars"
},
{
"position": 45,
"title": "SOMEBODY LOVES ME",
"artist": "PARTYNEXTDOOR & Drake"
},
{
"position": 46,
"title": "squabble up",
"artist": "Kendrick Lamar"
},
{
"position": 47,
"title": "MOTH BALLS",
"artist": "PARTYNEXTDOOR & Drake"
},
{
"position": 48,
"title": "GOOD CREDIT",
"artist": "Playboi Carti & Kendrick Lamar"
},
{
"position": 49,
"title": "WAY IT IS",
"artist": "Justin Bieber & Gunna"
},
{
"position": 50,
"title": "They Want To Be You",
"artist": "Lil Durk"
}
]
},
{
"songs": [
{


@ -37,6 +37,7 @@ Examples:
parser.add_argument('--songlist-priority', action='store_true', help='Prioritize downloads based on data/songList.json (default: enabled)')
parser.add_argument('--no-songlist-priority', action='store_true', help='Disable songlist prioritization')
parser.add_argument('--songlist-only', action='store_true', help='Only download songs that are in the songlist (skip all others)')
parser.add_argument('--songlist-focus', nargs='+', metavar='PLAYLIST_TITLE', help='Focus on specific playlists by title (e.g., --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100")')
parser.add_argument('--songlist-status', action='store_true', help='Show songlist download status and statistics')
parser.add_argument('--reset-channel', metavar='CHANNEL_NAME', help='Reset all tracking and files for a channel')
parser.add_argument('--reset-songlist', action='store_true', help='When used with --reset-channel, also reset songlist songs for this channel')
@ -68,6 +69,11 @@ Examples:
if args.songlist_only:
downloader.songlist_only = True
print("🎯 Songlist-only mode enabled (will only download songlist songs)")
if args.songlist_focus:
downloader.songlist_focus_titles = args.songlist_focus
downloader.songlist_only = True # Enable songlist-only mode when focusing
print(f"🎯 Songlist focus mode enabled for playlists: {', '.join(args.songlist_focus)}")
if args.resolution != '720p':
resolution_map = {
'480p': '480',
@ -172,7 +178,7 @@ Examples:
if len(tracking) > 10:
print(f" ... and {len(tracking) - 10} more")
sys.exit(0)
elif args.songlist_only:
elif args.songlist_only or args.songlist_focus:
# Use provided file or default to data/channels.txt
channel_file = args.file if args.file else "data/channels.txt"
if not os.path.exists(channel_file):


@ -45,8 +45,11 @@ def build_download_plan(channel_urls, undownloaded, tracker, yt_dlp_path, fuzzy_
song_lookup[key] = song
for i, channel_url in enumerate(channel_urls, 1):
print(f"\n🚦 Starting channel {i}/{len(channel_urls)}: {channel_url}")
print(f" 🔍 Getting channel info...")
channel_name, channel_id = get_channel_info(channel_url)
print(f"\n🚦 Starting channel {i}/{len(channel_urls)}: {channel_name} ({channel_url})")
print(f" ✅ Channel info: {channel_name} (ID: {channel_id})")
print(f" 🔍 Fetching video list from channel...")
available_videos = tracker.get_channel_video_list(
channel_url,
yt_dlp_path=str(yt_dlp_path),


@ -51,6 +51,11 @@ class KaraokeDownloader:
self.songlist_tracking = load_songlist_tracking(str(self.songlist_tracking_file))
# Load server songs for availability checking
self.server_songs = load_server_songs()
# Songlist focus mode attributes
self.songlist_focus_titles = None
self.songlist_only = False
self.use_songlist_priority = True
self.download_limit = None
def _load_config(self):
config_file = DATA_DIR / "config.json"
@ -281,10 +286,70 @@ class KaraokeDownloader:
For each song in the songlist, try each channel in order and download from the first channel where it is found.
Download up to 'limit' songs, skipping any that cannot be found, until the limit is reached or all possible matches are exhausted.
"""
songlist = load_songlist()
if not songlist:
print("⚠️ No songlist loaded. Skipping.")
return False
        # Apply songlist focus filtering if specified
        if self.songlist_focus_titles:
            # Load the raw songlist data to filter by playlist titles
            songlist_file = Path("data/songList.json")
            if not songlist_file.exists():
                print("⚠️ Songlist file not found: data/songList.json")
                return False
            try:
                with open(songlist_file, 'r', encoding='utf-8') as f:
                    raw_data = json.load(f)
                # Filter playlists by title
                focused_playlists = []
                print(f"🔍 Looking for playlists: {self.songlist_focus_titles}")
                print("🔍 Available playlists in songList.json:")
                for i, playlist in enumerate(raw_data[:5]):  # Show first 5 playlists
                    print(f" {i+1}. '{playlist.get('title', 'NO TITLE')}'")
                if len(raw_data) > 5:
                    print(f" ... and {len(raw_data) - 5} more playlists")
                for playlist in raw_data:
                    playlist_title = playlist.get('title', '')
                    if playlist_title in self.songlist_focus_titles:
                        focused_playlists.append(playlist)
                        print(f"✅ Found matching playlist: '{playlist_title}'")
                if not focused_playlists:
                    print(f"⚠️ No playlists found matching the specified titles: {', '.join(self.songlist_focus_titles)}")
                    return False
                # Flatten the focused playlists into songs
                focused_songs = []
                seen = set()
                for playlist in focused_playlists:
                    if "songs" in playlist:
                        for song in playlist["songs"]:
                            if "artist" in song and "title" in song:
                                artist = song["artist"].strip()
                                title = song["title"].strip()
                                key = f"{artist.lower()}_{title.lower()}"
                                if key in seen:
                                    continue
                                seen.add(key)
                                focused_songs.append({
                                    "artist": artist,
                                    "title": title,
                                    "position": song.get("position", 0)
                                })
                songlist = focused_songs
                print(f"\n🎯 Songlist focus mode: {len(focused_songs)} songs from {len(focused_playlists)} playlists selected")
                print(f"🎯 Focused playlists: {', '.join(self.songlist_focus_titles)}")
            except (json.JSONDecodeError, FileNotFoundError) as e:
                print(f"⚠️ Could not load songlist for filtering: {e}")
                return False
        else:
            # Load songlist normally (flattened from all playlists)
            songlist = load_songlist()
            if not songlist:
                print("⚠️ No songlist loaded. Skipping.")
                return False
# Filter for songs not yet downloaded
undownloaded = [s for s in songlist if not is_songlist_song_downloaded(self.songlist_tracking, s['artist'], s['title'])]
print(f"\n🎯 {len(songlist)} total unique songs in songlist.")
@ -324,92 +389,8 @@ class KaraokeDownloader:
if not undownloaded:
print("🎵 All songlist songs already downloaded.")
return True
# --- FAST MODE: Early exit and deduplication if limit is set ---
if limit is not None:
print("\n⚡ Fast mode enabled: will stop as soon as limit is reached with successful downloads.")
similarity = get_similarity_function()
downloaded_count = 0
unique_keys = set()
total_attempted = 0
for channel_url in channel_urls:
channel_name, channel_id = get_channel_info(channel_url)
available_videos = self.tracker.get_channel_video_list(
channel_url,
yt_dlp_path=str(self.yt_dlp_path),
force_refresh=False
)
for song in undownloaded:
artist, title = song['artist'], song['title']
key = create_song_key(artist, title)
if key in unique_keys:
continue # Already downloaded or queued
# Check if should skip this song during planning phase
should_skip, reason, _ = self._should_skip_song(
artist, title, channel_name, None, f"{artist} - {title}",
server_songs, server_duplicates_tracking
)
if should_skip:
continue
found = False
for video in available_videos:
v_artist, v_title = extract_artist_title(video['title'])
video_key = create_song_key(v_artist, v_title)
if fuzzy_match:
score = similarity(key, video_key)
if score >= fuzzy_threshold:
found = True
else:
if is_exact_match(artist, title, video['title']):
found = True
if found:
print(f"\n⬇️ Downloading {downloaded_count+1} of {limit}:")
print(f" 📋 Songlist: {artist} - {title}")
print(f" 🎬 Video: {video['title']} ({channel_name})")
if fuzzy_match:
print(f" 🎯 Match Score: {score:.1f}%")
# --- Download logic (reuse from below) ---
safe_title = title.replace("(From ", "").replace(")", "").replace(" - ", " ").replace(":", "").replace("'", "").replace('"', "")
safe_artist = artist.replace("'", "").replace('"', "")
invalid_chars = ['?', ':', '*', '"', '<', '>', '|', '/', '\\']
for char in invalid_chars:
safe_title = safe_title.replace(char, "")
safe_artist = safe_artist.replace(char, "")
safe_title = safe_title.replace("...", "").replace("..", "").replace(".", "").strip()
safe_artist = safe_artist.strip()
filename = f"{safe_artist} - {safe_title}.mp4"
# Call the actual download function (simulate the same as in the plan loop)
success = download_video_and_track(
self.yt_dlp_path,
self.config,
self.downloads_dir,
self.songlist_tracking,
channel_name,
channel_url,
video['id'],
video['title'],
artist,
title,
filename
)
total_attempted += 1
if success:
downloaded_count += 1
unique_keys.add(key)
print(f"✅ Downloaded and tracked: {artist} - {title}")
else:
print(f"❌ Download failed: {artist} - {title}")
if downloaded_count >= limit:
print(f"🎉 Reached download limit ({limit}). Stopping early.")
return True
break # Don't try to match this song to other videos in this channel
print(f"🎉 Downloaded {downloaded_count} unique songlist songs (limit was {limit}).")
if downloaded_count < limit:
print(f"⚠️ Only {downloaded_count} songs were downloaded. Some may not have been found or downloads failed.")
return True
# --- ORIGINAL FULL PLAN MODE (no limit) ---
# --- Download plan building (same for both normal and focus modes) ---
# --- Download plan cache logic ---
plan_mode = "songlist"
# Include all parameters that affect the plan generation
@ -425,10 +406,12 @@ class KaraokeDownloader:
cache_file = get_download_plan_cache_file(plan_mode, **plan_kwargs)
use_cache = False
download_plan, unmatched = load_cached_plan(cache_file)
if not force_refresh_download_plan and download_plan is not None:
if not force_refresh_download_plan and download_plan is not None and unmatched is not None:
use_cache = True
print(f"\n📋 Using cached download plan from: {cache_file}")
if not use_cache:
print("\n🔍 Pre-scanning channels for matches...")
print(f"\n🔍 Pre-scanning {len(channel_urls)} channels for matches...")
print(f"🔍 Scanning {len(undownloaded)} songs against all channels...")
download_plan, unmatched = build_download_plan(
channel_urls,
undownloaded,
@ -438,6 +421,7 @@ class KaraokeDownloader:
fuzzy_threshold=fuzzy_threshold
)
save_plan_cache(cache_file, download_plan, unmatched)
print(f"💾 Download plan cached to: {cache_file}")
print(f"\n📊 Download plan ready: {len(download_plan)} songs will be downloaded.")
print(f"{len(unmatched)} songs could not be found in any channel.")
if unmatched:
@ -446,6 +430,7 @@ class KaraokeDownloader:
print(f" - {song['artist']} - {song['title']}")
if len(unmatched) > DEFAULT_DISPLAY_LIMIT:
print(f" ...and {len(unmatched)-DEFAULT_DISPLAY_LIMIT} more.")
# --- Download phase ---
downloaded_count, success = execute_download_plan(
download_plan=download_plan,


@ -243,8 +243,10 @@ class TrackingManager:
channel_name, channel_id = get_channel_info(channel_url)
cache_key = channel_id or channel_url
if not force_refresh and cache_key in self.cache:
print(f" 📋 Using cached video list ({len(self.cache[cache_key])} videos)")
return self.cache[cache_key]
# Fetch with yt-dlp
print(f" 🌐 Fetching video list from YouTube (this may take a while)...")
import subprocess
cmd = [
yt_dlp_path,


@ -7,20 +7,29 @@ import json
from pathlib import Path
from typing import List, Dict, Any, Optional
def get_channel_info(channel_url: str, yt_dlp_path: str = "downloader/yt-dlp.exe") -> Dict[str, Any]:
"""Get channel information using yt-dlp."""
def get_channel_info(channel_url: str, yt_dlp_path: str = "downloader/yt-dlp.exe") -> tuple[str, str]:
"""Derive channel information from the URL without calling yt-dlp. Returns (channel_name, channel_id)."""
try:
cmd = [
yt_dlp_path,
"--dump-json",
"--no-playlist",
channel_url
]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
return json.loads(result.stdout)
except subprocess.CalledProcessError as e:
# Extract channel name from URL for now (faster than calling yt-dlp)
if "/@" in channel_url:
channel_name = channel_url.split("/@")[1].split("/")[0]
elif "/channel/" in channel_url:
channel_name = channel_url.split("/channel/")[1].split("/")[0]
else:
channel_name = "Unknown"
# Extract channel ID from URL
if "/channel/" in channel_url:
channel_id = channel_url.split("/channel/")[1].split("/")[0]
elif "/@" in channel_url:
channel_id = channel_url.split("/@")[1].split("/")[0]
else:
channel_id = channel_url
return channel_name, channel_id
except Exception as e:
print(f"❌ Failed to get channel info: {e}")
return {}
return "Unknown", channel_url
def get_playlist_info(playlist_url: str, yt_dlp_path: str = "downloader/yt-dlp.exe") -> List[Dict[str, Any]]:
"""Get playlist information using yt-dlp."""