Signed-off-by: Matt Bruce <mbrucedogs@gmail.com>
parent d184724c70
commit dd916a646a
10
PRD.md
@@ -51,6 +51,7 @@ These principles are fundamental to the project's long-term success and must be
|
||||
- **CLI Commands Documentation:** All CLI functionality, options, and usage examples must be documented in `cli/commands.txt`
|
||||
- **Code Comments:** Significant logic changes should include inline documentation
|
||||
- **API Documentation:** New endpoints, functions, or interfaces must be documented
|
||||
- **API Update Requirement:** Whenever a new API endpoint is added, the PRD.md, README.md, and cli/commands.txt MUST be updated to reflect the new functionality
|
||||
|
||||
**Documentation Update Checklist:**
|
||||
- [ ] Update PRD.md with any architectural or requirement changes
|
||||
@@ -59,6 +60,7 @@ These principles are fundamental to the project's long-term success and must be
|
||||
- [ ] Add inline comments for complex logic or business rules
|
||||
- [ ] Update any configuration examples or file structure documentation
|
||||
- [ ] Review and update implementation status sections
|
||||
- [ ] **API Updates:** When new API endpoints are added, update PRD.md, README.md, and cli/commands.txt
|
||||
|
||||
**CLI Commands Documentation Requirements:**
|
||||
- **Comprehensive Coverage:** All CLI arguments, options, and flags must be documented with examples
|
||||
@@ -68,6 +70,14 @@ These principles are fundamental to the project's long-term success and must be
|
||||
- **Integration Notes:** Document how CLI integrates with web UI and other components
|
||||
- **Version Tracking:** Keep version information and feature status up to date
|
||||
|
||||
**API Documentation Requirements:**
|
||||
- **Endpoint Documentation:** All new API endpoints must be documented in the PRD.md with their purpose, parameters, and responses
|
||||
- **README Integration:** API changes must be reflected in README.md with usage examples and integration notes
|
||||
- **CLI Integration:** If CLI commands interact with APIs, they must be documented in cli/commands.txt
|
||||
- **Version Tracking:** API versioning and changes must be tracked in documentation
|
||||
- **Error Handling:** Document all possible error responses and status codes
|
||||
- **Authentication:** Document any authentication requirements or API key usage
|
||||
|
||||
This documentation requirement is mandatory and ensures the project remains maintainable and accessible to future developers and users.
|
||||
|
||||
### 2.3 Code Quality & Development Standards
|
||||
|
||||
40
README.md
@@ -10,6 +10,7 @@ A comprehensive tool for analyzing, deduplicating, and cleaning up large karaoke
|
||||
- **CDG/MP3 Pairing**: Treats CDG and MP3 files with the same base filename as single karaoke units
|
||||
- **Channel Priority**: For MP4 files, prioritizes based on folder names in the path
|
||||
- **Fuzzy Matching**: Configurable fuzzy matching for artist/title comparison
|
||||
- **Playlist Validation**: Validates playlists against your song library with exact and fuzzy matching
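The CDG/MP3 pairing rule in the feature list can be illustrated with a short sketch. This is not the project's `find_mp3_pairs` helper, whose signature is not shown here; `pair_cdg_mp3` is a hypothetical name, and the sketch assumes that "same base filename" means the same path minus its extension.

```python
import os
from collections import defaultdict
from typing import Dict, List, Tuple


def pair_cdg_mp3(paths: List[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (cdg, mp3) pairs sharing a base filename, plus unpaired paths."""
    by_base: Dict[str, Dict[str, str]] = defaultdict(dict)
    others: List[str] = []
    for path in paths:
        base, ext = os.path.splitext(path)
        ext = ext.lower()
        if ext in (".cdg", ".mp3"):
            by_base[base][ext] = path
        else:
            # Anything that is neither CDG nor MP3 is passed through untouched.
            others.append(path)

    pairs: List[Tuple[str, str]] = []
    unpaired: List[str] = list(others)
    for files in by_base.values():
        if ".cdg" in files and ".mp3" in files:
            # Both halves present: treat them as one karaoke unit.
            pairs.append((files[".cdg"], files[".mp3"]))
        else:
            unpaired.extend(files.values())
    return pairs, unpaired
```

A lone `.mp3` or `.cdg` ends up in the unpaired list, so the caller can decide how to treat incomplete units.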
|
||||
|
||||
### File Type Priority System
|
||||
1. **MP4 files** (with channel priority sorting)
|
||||
@@ -32,12 +33,34 @@ A comprehensive tool for analyzing, deduplicating, and cleaning up large karaoke
|
||||
|
||||
## Installation
|
||||
|
||||
1. Clone the repository
|
||||
### Prerequisites
|
||||
|
||||
- Python 3.7 or higher
|
||||
- pip (Python package installer)
|
||||
|
||||
### Installation Steps
|
||||
|
||||
1. Clone the repository:
|
||||
```bash
|
||||
git clone <repository-url>
|
||||
cd KaraokeMerge
|
||||
```
|
||||
|
||||
2. Install dependencies:
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
**Note**: The installation includes:
|
||||
- **Flask** for the web UI
|
||||
- **fuzzywuzzy** and **python-Levenshtein** for fuzzy matching in playlist validation
|
||||
- All other required dependencies
|
||||
|
||||
3. Verify installation:
|
||||
```bash
|
||||
python -c "import flask, fuzzywuzzy; print('All dependencies installed successfully!')"
|
||||
```
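As an alternative to the one-liner, a small script can report each dependency separately without importing it. This is a sketch only: `check_modules` is a hypothetical helper, and `Levenshtein` is the import name provided by the `python-Levenshtein` package.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names taken from the dependency note above.
    for module, found in check_modules(["flask", "fuzzywuzzy", "Levenshtein"]).items():
        print(f"{module}: {'ok' if found else 'MISSING'}")
```

Because `find_spec` only locates a module, a package that is installed but broken would still be reported as present; the import-based one-liner catches that case.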
|
||||
|
||||
## Usage
|
||||
|
||||
### CLI Tool
|
||||
@@ -64,6 +87,21 @@ The web UI will automatically:
|
||||
2. Start the Flask server
|
||||
3. Open your default browser to the interface
|
||||
|
||||
### Playlist Validation
|
||||
|
||||
Validate your playlists against your song library:
|
||||
```bash
|
||||
cd cli
|
||||
python playlist_validator.py
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--playlist-index N`: Validate a specific playlist by index
|
||||
- `--output results.json`: Save results to a JSON file
|
||||
- `--apply`: Apply corrections to playlists (use with caution)
|
||||
|
||||
**Note**: Playlist validation uses fuzzy matching to find potential matches. Make sure fuzzywuzzy is installed for best results.
|
||||
|
||||
### Priority Preferences
|
||||
|
||||
The web UI now supports drag-and-drop priority management:
|
||||
|
||||
443
cli/commands.txt
@@ -1,77 +1,117 @@
|
||||
# Karaoke Song Library Cleanup Tool - CLI Commands Reference
|
||||
# Karaoke Song Library Cleanup Tool - CLI Commands Reference (v2.0)
|
||||
|
||||
## Overview
|
||||
The CLI tool analyzes karaoke song collections, identifies duplicates, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.
|
||||
The CLI tool analyzes karaoke song collections, identifies duplicates, validates playlists, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.
|
||||
|
||||
## Basic Usage
|
||||
## Quick Start Commands
|
||||
|
||||
### Standard Analysis
|
||||
### Basic Analysis (Most Common)
|
||||
```bash
|
||||
python cli/main.py
|
||||
cd cli
|
||||
python3 main.py
|
||||
```
|
||||
Runs the tool with default settings:
|
||||
- Input: `data/allSongs.json`
|
||||
- Config: `config/config.json`
|
||||
- Output: `data/skipSongs.json`
|
||||
- Verbose: Disabled
|
||||
- Reports: **Automatically generated** (including web UI data)
|
||||
- Reports: **Automatically generated**
|
||||
|
||||
### Verbose Output
|
||||
### Process Everything (Recommended)
|
||||
```bash
|
||||
python cli/main.py --verbose
|
||||
cd cli
|
||||
python3 main.py --process-all
|
||||
```
|
||||
Complete processing including:
|
||||
- Duplicate analysis and skip list generation
|
||||
- Favorites processing with priority logic (MP4 over MP3)
|
||||
- History processing with priority logic
|
||||
- Comprehensive report generation
|
||||
|
||||
## Main CLI Commands (main.py)
|
||||
|
||||
### Basic Analysis Commands
|
||||
|
||||
#### Standard Analysis
|
||||
```bash
|
||||
python3 main.py
|
||||
```
|
||||
Runs the tool with default settings and generates all reports automatically.
|
||||
|
||||
#### Verbose Output
|
||||
```bash
|
||||
python3 main.py --verbose
|
||||
# or
|
||||
python cli/main.py -v
|
||||
python3 main.py -v
|
||||
```
|
||||
Enables detailed output showing:
|
||||
- Individual song processing
|
||||
- Duplicate detection details
|
||||
- File type analysis
|
||||
- Channel priority decisions
|
||||
Enables detailed output showing individual song processing and decisions.
|
||||
|
||||
### Dry Run Mode
|
||||
#### Dry Run Mode
|
||||
```bash
|
||||
python cli/main.py --dry-run
|
||||
python3 main.py --dry-run
|
||||
```
|
||||
Analyzes songs without generating the skip list file. Useful for:
|
||||
- Testing configuration changes
|
||||
- Previewing results before committing
|
||||
- Validating input data
|
||||
Analyzes songs without generating the skip list file. Useful for testing and previewing results.
|
||||
|
||||
## Configuration Options
|
||||
### Configuration Commands
|
||||
|
||||
### Custom Configuration File
|
||||
#### Custom Configuration File
|
||||
```bash
|
||||
python cli/main.py --config path/to/custom_config.json
|
||||
python3 main.py --config path/to/custom_config.json
|
||||
```
|
||||
Uses a custom configuration file instead of the default `config/config.json`.
|
||||
|
||||
### Show Current Configuration
|
||||
#### Show Current Configuration
|
||||
```bash
|
||||
python cli/main.py --show-config
|
||||
python3 main.py --show-config
|
||||
```
|
||||
Displays the current configuration settings and exits. Useful for:
|
||||
- Verifying configuration values
|
||||
- Debugging configuration issues
|
||||
- Understanding current settings
|
||||
Displays the current configuration settings and exits.
|
||||
|
||||
## Input/Output Options
|
||||
### Input/Output Commands
|
||||
|
||||
### Custom Input File
|
||||
#### Custom Input File
|
||||
```bash
|
||||
python cli/main.py --input path/to/songs.json
|
||||
python3 main.py --input path/to/songs.json
|
||||
```
|
||||
Specifies a custom input file instead of the default `data/allSongs.json`.
|
||||
|
||||
### Custom Output Directory
|
||||
#### Custom Output Directory
|
||||
```bash
|
||||
python cli/main.py --output-dir ./custom_output
|
||||
python3 main.py --output-dir ./custom_output
|
||||
```
|
||||
Saves output files to a custom directory instead of the default `data/` folder.
|
||||
|
||||
## Report Generation
|
||||
### Processing Commands
|
||||
|
||||
### Detailed Reports (Always Generated)
|
||||
Reports are now **automatically generated** every time you run the CLI tool. The `--save-reports` flag is kept for backward compatibility but is no longer required.
|
||||
#### Process Favorites Only
|
||||
```bash
|
||||
python3 main.py --process-favorites
|
||||
```
|
||||
Processes favorites with priority-based logic to select best versions (MP4 over MP3).
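The "MP4 over MP3" selection can be sketched as a ranking over file extensions. This is an illustration under assumptions, not the tool's code: `pick_best_version` and `EXTENSION_RANK` are hypothetical names, and only the MP4-before-MP3 ordering comes from this reference.

```python
import os
from typing import List

# Lower rank wins. Only "MP4 over MP3" is stated in this reference;
# ranking every other extension last is an assumption of this sketch.
EXTENSION_RANK = {".mp4": 0, ".mp3": 1}


def pick_best_version(candidates: List[dict]) -> dict:
    """Return the candidate whose file extension has the best rank."""
    def rank(song: dict) -> int:
        ext = os.path.splitext(song["path"])[1].lower()
        return EXTENSION_RANK.get(ext, len(EXTENSION_RANK))

    # min() keeps the first candidate on ties, so input order breaks ties.
    return min(candidates, key=rank)
```

`min()` raises `ValueError` on an empty list, so a caller would need to skip favorites that have no candidates in the library.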
|
||||
|
||||
#### Process History Only
|
||||
```bash
|
||||
python3 main.py --process-history
|
||||
```
|
||||
Processes history with priority-based logic to select best versions (MP4 over MP3).
|
||||
|
||||
#### Process Everything
|
||||
```bash
|
||||
python3 main.py --process-all
|
||||
```
|
||||
Processes everything: duplicates, generates reports, AND updates favorites/history with priority logic.
|
||||
|
||||
#### Merge History Objects
|
||||
```bash
|
||||
python3 main.py --merge-history
|
||||
```
|
||||
Merges history objects that match on artist, title, and path, summing their count properties.
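The merge rule described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: `merge_history` is a hypothetical name, and entries are assumed to carry `artist`, `title`, `path`, and `count` keys as in `history.json`.

```python
from typing import Dict, List, Tuple


def merge_history(entries: List[dict]) -> List[dict]:
    """Merge entries sharing (artist, title, path), summing their counts."""
    merged: Dict[Tuple[str, str, str], dict] = {}
    for entry in entries:
        key = (entry["artist"], entry["title"], entry["path"])
        if key in merged:
            merged[key]["count"] += entry.get("count", 0)
        else:
            # Copy so the caller's objects are not mutated.
            merged[key] = {**entry, "count": entry.get("count", 0)}
    return list(merged.values())
```

Two entries for the same file with counts 2 and 3 collapse into one entry with a count of 5, while entries that differ in any of the three key fields stay separate.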
|
||||
|
||||
### Report Generation
|
||||
|
||||
#### Save Detailed Reports (Legacy)
|
||||
```bash
|
||||
python3 main.py --save-reports
|
||||
```
|
||||
**Note**: Reports are now automatically generated every time you run the CLI tool. This flag is kept for backward compatibility.
|
||||
|
||||
Generated reports include:
|
||||
- `enhanced_summary_report.txt` - Comprehensive analysis
|
||||
@@ -82,43 +122,244 @@ Generated reports include:
|
||||
- `analysis_data.json` - Raw analysis data for further processing
|
||||
- `skip_songs_detailed.json` - **Web UI data (always generated)**
|
||||
|
||||
## Combined Examples
|
||||
## Playlist Validator Commands (playlist_validator.py)
|
||||
|
||||
### Full Analysis with Reports
|
||||
### Basic Playlist Validation
|
||||
|
||||
#### Validate All Playlists
|
||||
```bash
|
||||
python cli/main.py --verbose
|
||||
python3 playlist_validator.py
|
||||
```
|
||||
Runs complete analysis with:
|
||||
Validates all playlists in `data/songLists.json` against the song library.
|
||||
|
||||
#### Validate Specific Playlist
|
||||
```bash
|
||||
python3 playlist_validator.py --playlist-index 0
|
||||
```
|
||||
Validates a specific playlist by index (0-based).
|
||||
|
||||
### Playlist Validator Options
|
||||
|
||||
#### Custom Configuration
|
||||
```bash
|
||||
python3 playlist_validator.py --config path/to/custom_config.json
|
||||
```
|
||||
Uses a custom configuration file.
|
||||
|
||||
#### Custom Data Directory
|
||||
```bash
|
||||
python3 playlist_validator.py --data-dir path/to/data
|
||||
```
|
||||
Uses a custom data directory.
|
||||
|
||||
#### Apply Changes (Disable Dry Run)
|
||||
```bash
|
||||
python3 playlist_validator.py --apply
|
||||
```
|
||||
Applies changes to playlists instead of just previewing them.
|
||||
|
||||
#### Output Results to File
|
||||
```bash
|
||||
python3 playlist_validator.py --output results.json
|
||||
```
|
||||
Saves validation results to a JSON file.
|
||||
|
||||
## Comprehensive Examples
|
||||
|
||||
### Complete Workflow Examples
|
||||
|
||||
#### 1. Full Analysis with Everything
|
||||
```bash
|
||||
cd cli
|
||||
python3 main.py --process-all --verbose
|
||||
```
|
||||
Complete processing with detailed output:
|
||||
- Duplicate analysis and skip list generation
|
||||
- Favorites and history processing with priority logic
|
||||
- Comprehensive report generation
|
||||
- Verbose output for detailed processing information
|
||||
- **Automatic comprehensive report generation**
|
||||
- Skip list creation
|
||||
|
||||
### Custom Configuration with Dry Run
|
||||
#### 2. Preview Changes Before Applying
|
||||
```bash
|
||||
python cli/main.py --config custom_config.json --dry-run --verbose
|
||||
cd cli
|
||||
python3 main.py --process-all --dry-run --verbose
|
||||
```
|
||||
Tests a custom configuration without generating files:
|
||||
- Uses custom configuration
|
||||
Preview all changes without saving:
|
||||
- Shows what would be processed
|
||||
- No files are modified
|
||||
- Useful for testing configuration changes
|
||||
|
||||
#### 3. Custom Configuration Testing
|
||||
```bash
|
||||
cd cli
|
||||
python3 main.py --config custom_config.json --dry-run --verbose
|
||||
```
|
||||
Test a custom configuration:
|
||||
- Uses custom configuration file
|
||||
- Shows detailed processing
|
||||
- No output files created
|
||||
|
||||
### Custom Input/Output with Reports
|
||||
#### 4. Process Only Favorites and History
|
||||
```bash
|
||||
python cli/main.py --input /path/to/songs.json --output-dir ./reports
|
||||
cd cli
|
||||
python3 main.py --process-favorites --process-history
|
||||
```
|
||||
Process only favorites and history files:
|
||||
- Updates favorites with best versions (MP4 over MP3)
|
||||
- Updates history with best versions
|
||||
- No duplicate analysis performed
|
||||
|
||||
#### 5. Merge History Objects
|
||||
```bash
|
||||
cd cli
|
||||
python3 main.py --merge-history --dry-run
|
||||
```
|
||||
Preview history merging:
|
||||
- Shows which history objects would be merged
|
||||
- No files are modified
|
||||
|
||||
#### 6. Apply History Merging
|
||||
```bash
|
||||
cd cli
|
||||
python3 main.py --merge-history
|
||||
```
|
||||
Actually merge history objects:
|
||||
- Combines duplicate history entries
|
||||
- Sums count properties
|
||||
- Saves updated history file
|
||||
|
||||
### Playlist Validation Examples
|
||||
|
||||
#### 1. Validate All Playlists
|
||||
```bash
|
||||
cd cli
|
||||
python3 playlist_validator.py
|
||||
```
|
||||
Validates all playlists and shows summary:
|
||||
- Total playlists and songs
|
||||
- Exact matches found
|
||||
- Missing songs count
|
||||
- Fuzzy matches (if available)
|
||||
|
||||
#### 2. Validate Specific Playlist
|
||||
```bash
|
||||
cd cli
|
||||
python3 playlist_validator.py --playlist-index 5
|
||||
```
|
||||
Validates playlist at index 5:
|
||||
- Shows detailed results for that specific playlist
|
||||
- Lists exact matches and missing songs
|
||||
|
||||
#### 3. Save Validation Results
|
||||
```bash
|
||||
cd cli
|
||||
python3 playlist_validator.py --output validation_results.json
|
||||
```
|
||||
Saves detailed validation results to JSON file for further analysis.
|
||||
|
||||
#### 4. Apply Playlist Corrections
|
||||
```bash
|
||||
cd cli
|
||||
python3 playlist_validator.py --apply
|
||||
```
|
||||
Applies corrections to playlists (use with caution).
|
||||
|
||||
### Advanced Examples
|
||||
|
||||
#### 1. Custom Input/Output with Full Processing
|
||||
```bash
|
||||
cd cli
|
||||
python3 main.py --input /path/to/songs.json --output-dir ./reports --process-all --verbose
|
||||
```
|
||||
Processes custom input and saves all outputs to reports directory:
|
||||
- Custom input file
|
||||
- Custom output location
|
||||
- **All report files automatically generated**
|
||||
- Full processing including favorites/history
|
||||
- Verbose output
|
||||
|
||||
### Minimal Output
|
||||
#### 2. Configuration Testing Workflow
|
||||
```bash
|
||||
python cli/main.py --output-dir ./minimal
|
||||
cd cli
|
||||
# Show current configuration
|
||||
python3 main.py --show-config
|
||||
|
||||
# Test with dry run
|
||||
python3 main.py --dry-run --verbose
|
||||
|
||||
# Test with custom config
|
||||
python3 main.py --config test_config.json --dry-run --verbose
|
||||
```
|
||||
Runs with minimal output:
|
||||
- No verbose logging
|
||||
- No detailed reports
|
||||
- Only generates skip list
|
||||
|
||||
#### 3. Playlist Analysis Workflow
|
||||
```bash
|
||||
cd cli
|
||||
# Validate all playlists
|
||||
python3 playlist_validator.py
|
||||
|
||||
# Validate specific playlist
|
||||
python3 playlist_validator.py --playlist-index 0
|
||||
|
||||
# Save detailed results
|
||||
python3 playlist_validator.py --output playlist_analysis.json
|
||||
```
|
||||
|
||||
#### 4. Complete System Analysis
|
||||
```bash
|
||||
cd cli
|
||||
# Process everything
|
||||
python3 main.py --process-all --verbose
|
||||
|
||||
# Validate playlists
|
||||
python3 playlist_validator.py
|
||||
|
||||
# Show configuration
|
||||
python3 main.py --show-config
|
||||
```
|
||||
|
||||
## Command Line Options Reference
|
||||
|
||||
### Main CLI (main.py) Options
|
||||
|
||||
| Option | Description | Default |
|
||||
|--------|-------------|---------|
|
||||
| `--config` | Configuration file path | `../config/config.json` |
|
||||
| `--input` | Input songs file path | `../data/allSongs.json` |
|
||||
| `--output-dir` | Output directory | `../data` |
|
||||
| `--verbose, -v` | Enable verbose output | `False` |
|
||||
| `--dry-run` | Analyze without generating files | `False` |
|
||||
| `--save-reports` | Save detailed reports | `True` (always enabled) |
|
||||
| `--show-config` | Show configuration and exit | `False` |
|
||||
| `--process-favorites` | Process favorites with priority logic | `False` |
|
||||
| `--process-history` | Process history with priority logic | `False` |
|
||||
| `--process-all` | Process everything | `False` |
|
||||
| `--merge-history` | Merge history objects | `False` |
|
||||
|
||||
### Playlist Validator (playlist_validator.py) Options
|
||||
|
||||
| Option | Description | Default |
|
||||
|--------|-------------|---------|
|
||||
| `--config` | Configuration file path | `../config/config.json` |
|
||||
| `--data-dir` | Data directory path | `../data` |
|
||||
| `--dry-run` | Dry run mode | `True` |
|
||||
| `--apply` | Apply changes (disable dry run) | `False` |
|
||||
| `--playlist-index` | Validate specific playlist by index | `None` |
|
||||
| `--output` | Output results to JSON file | `None` |
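The option table above maps naturally onto `argparse`. The sketch below mirrors the documented flags and defaults; how `playlist_validator.py` actually wires them is not shown in this commit, so `build_parser` and `parse` are hypothetical names.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags and defaults listed in the table."""
    parser = argparse.ArgumentParser(prog="playlist_validator.py")
    parser.add_argument("--config", default="../config/config.json")
    parser.add_argument("--data-dir", default="../data")
    parser.add_argument("--dry-run", action="store_true", default=True)
    parser.add_argument("--apply", action="store_true")
    parser.add_argument("--playlist-index", type=int, default=None)
    parser.add_argument("--output", default=None)
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments and resolve the dry-run/apply interaction."""
    args = build_parser().parse_args(argv)
    # --apply switches dry-run off, matching "Apply changes (disable dry run)".
    if args.apply:
        args.dry_run = False
    return args
```

With a default of `True`, passing `--dry-run` explicitly changes nothing; only `--apply` turns write mode on.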
|
||||
|
||||
## File Structure Requirements
|
||||
|
||||
### Required Files
|
||||
- `data/allSongs.json` - Main song library
|
||||
- `config/config.json` - Configuration settings
|
||||
|
||||
### Optional Files
|
||||
- `data/favorites.json` - Favorites list (for processing)
|
||||
- `data/history.json` - History list (for processing)
|
||||
- `data/songLists.json` - Playlists (for validation)
|
||||
|
||||
### Generated Files
|
||||
- `data/skipSongs.json` - Skip list for future imports
|
||||
- `data/reports/` - Directory containing all analysis reports
|
||||
- `data/preferences/` - Directory containing priority preferences
|
||||
|
||||
## Configuration File Structure
|
||||
|
||||
@@ -148,31 +389,9 @@ The default configuration file (`config/config.json`) contains:
|
||||
}
|
||||
```
|
||||
|
||||
### Configuration Options Explained
|
||||
|
||||
#### Channel Priorities
|
||||
- **channel_priorities**: Array of folder names for MP4 files
|
||||
- Order determines priority (first = highest priority)
|
||||
- Files without matching folders are marked for manual review
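The ordering rule for `channel_priorities` can be sketched as a lookup over the folders in a file path. This is an illustration only: `channel_rank` is a hypothetical helper, and exact, case-insensitive folder-name matching is an assumption (the project's own extraction helpers may match differently).

```python
from pathlib import PurePath
from typing import List, Optional


def channel_rank(file_path: str, channel_priorities: List[str]) -> Optional[int]:
    """Return the index of the first priority folder found in the path.

    None means no configured folder matched, i.e. manual review.
    """
    # Accept both "/" and "\\" separators regardless of platform.
    parts = PurePath(file_path.replace("\\", "/")).parts
    folders = {part.lower() for part in parts}
    for index, channel in enumerate(channel_priorities):
        if channel.lower() in folders:
            return index
    return None
```

A lower index means a higher priority, so sorting candidate MP4 files by this value (with `None` handled separately) yields the preferred version first.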
|
||||
|
||||
#### Matching Settings
|
||||
- **fuzzy_matching**: Enable/disable fuzzy string matching
|
||||
- **fuzzy_threshold**: Similarity threshold (0.0-1.0) for fuzzy matching
|
||||
- **case_sensitive**: Case-sensitive artist/title comparison
|
||||
|
||||
#### Output Settings
|
||||
- **verbose**: Enable detailed output
|
||||
- **include_reasons**: Include reason field in skip list
|
||||
- **max_duplicates_per_song**: Maximum duplicates to process per song
|
||||
|
||||
#### File Type Settings
|
||||
- **supported_extensions**: All supported file extensions
|
||||
- **mp4_extensions**: Extensions treated as MP4 files
|
||||
|
||||
## Input File Format
|
||||
|
||||
The tool expects a JSON array of song objects:
|
||||
## Input File Formats
|
||||
|
||||
### Song Library Format (allSongs.json)
|
||||
```json
|
||||
[
|
||||
{
|
||||
@@ -183,9 +402,45 @@ The tool expects a JSON array of song objects:
|
||||
]
|
||||
```
|
||||
|
||||
Optional fields for MP4 files:
|
||||
- `channel`: Channel/folder information
|
||||
- ID3 tag information (artist, title, etc.)
|
||||
### Playlist Format (songLists.json)
|
||||
```json
|
||||
[
|
||||
{
|
||||
"title": "Playlist Name",
|
||||
"songs": [
|
||||
{
|
||||
"position": 1,
|
||||
"artist": "Artist Name",
|
||||
"title": "Song Title"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
```
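To show how a playlist in this shape might be checked against the song library, here is a minimal exact-match sketch. It is not the validator's actual logic: `normalize` and `validate_playlist` are hypothetical names, and the comparison is assumed to be case-insensitive with whitespace collapsed.

```python
from typing import Dict, List, Set, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def validate_playlist(playlist: dict, library: List[dict]) -> Dict[str, list]:
    """Split a playlist's song positions into exact matches and missing entries."""
    keys: Set[Tuple[str, str]] = {
        (normalize(s["artist"]), normalize(s["title"])) for s in library
    }
    result: Dict[str, list] = {"exact": [], "missing": []}
    for song in playlist["songs"]:
        key = (normalize(song["artist"]), normalize(song["title"]))
        result["exact" if key in keys else "missing"].append(song["position"])
    return result
```

Positions reported as missing would be the natural candidates for a subsequent fuzzy-matching pass.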
|
||||
|
||||
### Favorites Format (favorites.json)
|
||||
```json
|
||||
[
|
||||
{
|
||||
"artist": "Artist Name",
|
||||
"title": "Song Title",
|
||||
"path": "path/to/file.mp3",
|
||||
"favorite": true
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
### History Format (history.json)
|
||||
```json
|
||||
[
|
||||
{
|
||||
"artist": "Artist Name",
|
||||
"title": "Song Title",
|
||||
"path": "path/to/file.mp3",
|
||||
"count": 5
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
## Output Files
|
||||
|
||||
@@ -193,7 +448,7 @@ Optional fields for MP4 files:
|
||||
- **skipSongs.json**: List of file paths to skip in future imports
|
||||
- Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`
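A consumer or test script might build entries in this shape as sketched below. The helper names `build_skip_list` and `dump_skip_list` are hypothetical, and the only fields assumed are the `path` and `reason` keys shown in the format line.

```python
import json
from typing import List


def build_skip_list(duplicate_paths: List[str], reason: str = "duplicate") -> List[dict]:
    """Build skip-list entries in the documented {"path", "reason"} shape."""
    return [{"path": path, "reason": reason} for path in duplicate_paths]


def dump_skip_list(entries: List[dict]) -> str:
    """Serialize entries as pretty-printed JSON text."""
    # ensure_ascii=False keeps non-ASCII artist and file names readable.
    return json.dumps(entries, indent=2, ensure_ascii=False)
```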
|
||||
|
||||
### Report Files (with --save-reports)
|
||||
### Report Files (Automatically Generated)
|
||||
- **enhanced_summary_report.txt**: Overall analysis and statistics
|
||||
- **channel_optimization_report.txt**: Channel priority suggestions
|
||||
- **duplicate_pattern_report.txt**: Duplicate detection patterns
|
||||
@@ -222,7 +477,7 @@ The tool provides clear error messages for:
|
||||
|
||||
## Performance Notes
|
||||
|
||||
- Successfully tested with 37,000+ songs
|
||||
- Successfully tested with 49,000+ songs
|
||||
- Processes large datasets efficiently
|
||||
- Shows progress indicators for long operations
|
||||
- Memory-efficient processing
|
||||
@@ -245,17 +500,29 @@ The CLI tool integrates with the web UI:
|
||||
|
||||
### Debug Mode
|
||||
```bash
|
||||
python cli/main.py --verbose --dry-run --show-config
|
||||
cd cli
|
||||
python3 main.py --verbose --dry-run --show-config
|
||||
```
|
||||
Complete debugging setup:
|
||||
- Shows configuration
|
||||
- Verbose processing
|
||||
- No file changes
|
||||
|
||||
### Playlist Validator Debug
|
||||
```bash
|
||||
cd cli
|
||||
python3 playlist_validator.py --dry-run --output debug_results.json
|
||||
```
|
||||
Debug playlist validation:
|
||||
- Dry run mode
|
||||
- Save results to file
|
||||
- No playlist modifications
|
||||
|
||||
## Version Information
|
||||
|
||||
This commands reference is for Karaoke Song Library Cleanup Tool v2.0
|
||||
- CLI: Fully functional with comprehensive options
|
||||
- Web UI: Interactive priority management
|
||||
- Priority System: Drag-and-drop with persistence
|
||||
- Reports: Enhanced analysis with actionable insights
|
||||
- Reports: Enhanced analysis with actionable insights
|
||||
- Playlist Validator: Complete playlist analysis and validation
|
||||
124359
cli/complete_playlist_validation.json
Normal file
File diff suppressed because it is too large
98157
cli/final_playlist_validation.json
Normal file
File diff suppressed because it is too large
@@ -17,6 +17,7 @@ from utils import (
|
||||
extract_consolidated_channel_from_path,
|
||||
get_file_extension,
|
||||
parse_multi_artist,
|
||||
clean_artist_name,
|
||||
validate_song_data,
|
||||
find_mp3_pairs
|
||||
)
|
||||
@@ -63,10 +64,15 @@ class SongMatcher:
|
||||
if not validate_song_data(song):
|
||||
continue
|
||||
|
||||
# Handle multi-artist songs
|
||||
artists = parse_multi_artist(song['artist'])
|
||||
# Clean and handle artist names
|
||||
cleaned_artist = clean_artist_name(song['artist'])
|
||||
if not cleaned_artist:
|
||||
cleaned_artist = song['artist'] # Fallback to original if cleaning fails
|
||||
|
||||
# Handle multi-artist songs (after cleaning)
|
||||
artists = parse_multi_artist(cleaned_artist)
|
||||
if not artists:
|
||||
artists = [song['artist']]
|
||||
artists = [cleaned_artist]
|
||||
|
||||
# Create groups for each artist variation
|
||||
for artist in artists:
|
||||
@@ -90,10 +96,15 @@ class SongMatcher:
|
||||
if i % 1000 == 0 and i > 0:
|
||||
print(f"Processing song {i:,}/{len(songs):,}...")
|
||||
|
||||
# Handle multi-artist songs
|
||||
artists = parse_multi_artist(song['artist'])
|
||||
# Clean and handle artist names
|
||||
cleaned_artist = clean_artist_name(song['artist'])
|
||||
if not cleaned_artist:
|
||||
cleaned_artist = song['artist'] # Fallback to original if cleaning fails
|
||||
|
||||
# Handle multi-artist songs (after cleaning)
|
||||
artists = parse_multi_artist(cleaned_artist)
|
||||
if not artists:
|
||||
artists = [song['artist']]
|
||||
artists = [cleaned_artist]
|
||||
|
||||
# Try exact matching first
|
||||
added_to_exact = False
|
||||
@@ -117,10 +128,15 @@ class SongMatcher:
|
||||
if i % 100 == 0 and i > 0:
|
||||
print(f"Fuzzy matching song {i:,}/{len(ungrouped_songs):,}...")
|
||||
|
||||
# Handle multi-artist songs
|
||||
artists = parse_multi_artist(song['artist'])
|
||||
# Clean and handle artist names
|
||||
cleaned_artist = clean_artist_name(song['artist'])
|
||||
if not cleaned_artist:
|
||||
cleaned_artist = song['artist'] # Fallback to original if cleaning fails
|
||||
|
||||
# Handle multi-artist songs (after cleaning)
|
||||
artists = parse_multi_artist(cleaned_artist)
|
||||
if not artists:
|
||||
artists = [song['artist']]
|
||||
artists = [cleaned_artist]
|
||||
|
||||
# Try to find an existing fuzzy group
|
||||
added_to_group = False
|
||||
|
||||
99907
cli/playlist_validation_results.json
Normal file
File diff suppressed because it is too large
@@ -21,6 +21,7 @@ from utils import (
|
||||
extract_channel_from_path,
|
||||
get_file_extension,
|
||||
parse_multi_artist,
|
||||
clean_artist_name,
|
||||
validate_song_data
|
||||
)
|
||||
|
||||
@@ -63,10 +64,15 @@ class PlaylistValidator:
|
||||
if not validate_song_data(song):
|
||||
continue
|
||||
|
||||
# Handle multi-artist songs
|
||||
artists = parse_multi_artist(song['artist'])
|
||||
# Clean and handle artist names
|
||||
cleaned_artist = clean_artist_name(song['artist'])
|
||||
if not cleaned_artist:
|
||||
cleaned_artist = song['artist'] # Fallback to original if cleaning fails
|
||||
|
||||
# Handle multi-artist songs (after cleaning)
|
||||
artists = parse_multi_artist(cleaned_artist)
|
||||
if not artists:
|
||||
artists = [song['artist']]
|
||||
artists = [cleaned_artist]
|
||||
|
||||
# Create exact match keys
|
||||
for artist in artists:
|
||||
|
||||
44
cli/utils.py
@@ -218,6 +218,50 @@ def extract_consolidated_channel_from_path(file_path: str, channel_priorities: L
     return None
 
 
+def clean_artist_name(artist_string: str) -> str:
+    """Clean artist name by removing features, collaborations, etc."""
+    if not artist_string:
+        return ""
+
+    # Remove common feature/collaboration patterns (more precise)
+    patterns_to_remove = [
+        r'\s*feat\.?\s*.*$',      # feat. anything after
+        r'\s*ft\.?\s*.*$',        # ft. anything after
+        r'\s*featuring\s*.*$',    # featuring anything after
+        r'\s*with\s*.*$',         # with anything after
+        r'\s*presents\s*.*$',     # presents anything after
+        r'\s*featuring\s*.*$',    # featuring anything after
+        r'\s*feat\s*.*$',         # feat anything after
+        r'\s*ft\s*.*$',           # ft anything after
+    ]
+
+    # Handle comma/semicolon/slash patterns more carefully
+    # Only remove if they're followed by feature words
+    separator_patterns = [
+        r'\s*,\s*(feat\.?|ft\.?|featuring|with|presents).*$',  # comma followed by feature words
+        r'\s*;\s*(feat\.?|ft\.?|featuring|with|presents).*$',  # semicolon followed by feature words
+        r'\s*/\s*(feat\.?|ft\.?|featuring|with|presents).*$',  # slash followed by feature words
+    ]
+
+    cleaned_artist = artist_string
+
+    # Apply feature removal patterns first
+    for pattern in patterns_to_remove:
+        cleaned_artist = re.sub(pattern, '', cleaned_artist, flags=re.IGNORECASE)
+
+    # Apply separator patterns only if they're followed by feature words
+    for pattern in separator_patterns:
+        cleaned_artist = re.sub(pattern, '', cleaned_artist, flags=re.IGNORECASE)
+
+    # Clean up any trailing separators that might be left
+    cleaned_artist = re.sub(r'\s*[,;/]\s*$', '', cleaned_artist)
+
+    # Clean up extra whitespace
+    cleaned_artist = re.sub(r'\s+', ' ', cleaned_artist).strip()
+
+    return cleaned_artist
+
+
 def parse_multi_artist(artist_string: str) -> List[str]:
     """Parse multi-artist strings with various delimiters."""
     if not artist_string:
||||
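One behaviour of the patterns in `clean_artist_name` may be worth checking: none of them is anchored on a word boundary, so a keyword can also match inside a longer word. As written, `r'\s*ft\.?\s*.*$'` turns "Daft Punk" into "Da", and `r'\s*with\s*.*$'` turns "Bill Withers" into "Bill". The sketch below shows one possible variant using `\b`; `strip_features`, `FEATURE_WORDS`, and `FEATURE_TAIL` are hypothetical names, and this is a suggestion rather than the project's code.

```python
import re

# Hypothetical variant: the same feature keywords, anchored on word boundaries
# so that "ft" inside "Daft" or "with" inside "Withers" is left alone.
FEATURE_WORDS = r"(?:feat|ft|featuring|with|presents)"
FEATURE_TAIL = re.compile(
    rf"\s*[,;/]?\s*\b{FEATURE_WORDS}\b\.?\s*.*$",
    flags=re.IGNORECASE,
)


def strip_features(artist: str) -> str:
    """Drop a trailing 'feat./ft./featuring/with/presents ...' clause."""
    cleaned = FEATURE_TAIL.sub("", artist)
    # Collapse any leftover runs of whitespace.
    return re.sub(r"\s+", " ", cleaned).strip()
```

A name that begins with a keyword (for example a band whose name starts with "With") would still be reduced to an empty string here. The callers in this commit fall back to the original artist when cleaning returns an empty value, which covers that case.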
@@ -5,7 +5,7 @@
|
||||
"Stingray Karaoke"
|
||||
],
|
||||
"matching": {
|
||||
"fuzzy_matching": false,
|
||||
"fuzzy_matching": true,
|
||||
"fuzzy_threshold": 0.85,
|
||||
"case_sensitive": false
|
||||
},
|
||||
|
||||
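To make the effect of `fuzzy_threshold` and `case_sensitive` concrete, here is a sketch that uses the standard library's `difflib` as a stand-in for fuzzywuzzy. `is_fuzzy_match` is a hypothetical helper. Note that fuzzywuzzy's `fuzz.ratio` returns an integer from 0 to 100, so a threshold of 0.85 would correspond to a score of 85; how the project converts between the two scales is not shown in this commit.

```python
from difflib import SequenceMatcher


def is_fuzzy_match(a: str, b: str, threshold: float = 0.85,
                   case_sensitive: bool = False) -> bool:
    """True when the similarity ratio of a and b reaches the threshold."""
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    # ratio() is a float in [0.0, 1.0]; 1.0 means the strings are identical.
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

With the default settings, a single dropped letter in a long title still matches, while unrelated titles fall well below the threshold.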
@@ -1,16 +1,12 @@
|
||||
# Python dependencies for KaraokeMerge CLI tool
|
||||
|
||||
# Core dependencies (currently using only standard library)
|
||||
# No external dependencies required for basic functionality
|
||||
# Core dependencies
|
||||
flask>=2.0.0
|
||||
|
||||
# Optional dependencies for enhanced features:
|
||||
# Uncomment the following lines if you want to enable fuzzy matching:
|
||||
# Fuzzy matching dependencies (required for playlist validation)
|
||||
fuzzywuzzy>=0.18.0
|
||||
python-Levenshtein>=0.21.0
|
||||
|
||||
# For future enhancements:
|
||||
# pandas>=1.5.0 # For advanced data analysis
|
||||
# click>=8.0.0 # For enhanced CLI interface
|
||||
|
||||
# Web UI dependencies
|
||||
flask>=2.0.0
|
||||
# click>=8.0.0 # For enhanced CLI interface
|
||||
@@ -10,21 +10,38 @@ import webbrowser
 from time import sleep
 
 def check_dependencies():
-    """Check if Flask is installed."""
+    """Check if required dependencies are installed."""
+    dependencies_ok = True
+
+    # Check Flask
     try:
         import flask
         print("✅ Flask is installed")
-        return True
     except ImportError:
         print("❌ Flask is not installed")
         print("Installing Flask...")
         try:
             subprocess.check_call([sys.executable, "-m", "pip", "install", "flask>=2.0.0"])
             print("✅ Flask installed successfully")
-            return True
         except subprocess.CalledProcessError:
             print("❌ Failed to install Flask")
-            return False
+            dependencies_ok = False
+
+    # Check fuzzywuzzy for playlist validation
+    try:
+        import fuzzywuzzy
+        print("✅ fuzzywuzzy is installed (for playlist validation)")
+    except ImportError:
+        print("❌ fuzzywuzzy is not installed")
+        print("Installing fuzzywuzzy and python-Levenshtein...")
+        try:
+            subprocess.check_call([sys.executable, "-m", "pip", "install", "fuzzywuzzy>=0.18.0", "python-Levenshtein>=0.21.0"])
+            print("✅ fuzzywuzzy installed successfully")
+        except subprocess.CalledProcessError:
+            print("❌ Failed to install fuzzywuzzy")
+            print("⚠️ Playlist validation will work without fuzzy matching")
+
+    return dependencies_ok
 
 def check_data_files():
     """Check if required data files exist."""
@@ -1449,4 +1449,4 @@ def apply_all_updates():
 
 
 if __name__ == '__main__':
-    app.run(debug=True, host='0.0.0.0', port=5000)
+    app.run(debug=True, host='0.0.0.0', port=5001)