Signed-off-by: Matt Bruce <mbrucedogs@gmail.com>
parent
d184724c70
commit
dd916a646a
10
PRD.md
@@ -51,6 +51,7 @@ These principles are fundamental to the project's long-term success and must be
|
|||||||
- **CLI Commands Documentation:** All CLI functionality, options, and usage examples must be documented in `cli/commands.txt`
|
- **CLI Commands Documentation:** All CLI functionality, options, and usage examples must be documented in `cli/commands.txt`
|
||||||
- **Code Comments:** Significant logic changes should include inline documentation
|
- **Code Comments:** Significant logic changes should include inline documentation
|
||||||
- **API Documentation:** New endpoints, functions, or interfaces must be documented
|
- **API Documentation:** New endpoints, functions, or interfaces must be documented
|
||||||
|
- **API Update Requirement:** Whenever a new API endpoint is added, the PRD.md, README.md, and cli/commands.txt MUST be updated to reflect the new functionality
|
||||||
|
|
||||||
**Documentation Update Checklist:**
|
**Documentation Update Checklist:**
|
||||||
- [ ] Update PRD.md with any architectural or requirement changes
|
- [ ] Update PRD.md with any architectural or requirement changes
|
||||||
@@ -59,6 +60,7 @@ These principles are fundamental to the project's long-term success and must be
|
|||||||
- [ ] Add inline comments for complex logic or business rules
|
- [ ] Add inline comments for complex logic or business rules
|
||||||
- [ ] Update any configuration examples or file structure documentation
|
- [ ] Update any configuration examples or file structure documentation
|
||||||
- [ ] Review and update implementation status sections
|
- [ ] Review and update implementation status sections
|
||||||
|
- [ ] **API Updates:** When new API endpoints are added, update PRD.md, README.md, and cli/commands.txt
|
||||||
|
|
||||||
**CLI Commands Documentation Requirements:**
|
**CLI Commands Documentation Requirements:**
|
||||||
- **Comprehensive Coverage:** All CLI arguments, options, and flags must be documented with examples
|
- **Comprehensive Coverage:** All CLI arguments, options, and flags must be documented with examples
|
||||||
@@ -68,6 +70,14 @@ These principles are fundamental to the project's long-term success and must be
|
|||||||
- **Integration Notes:** Document how CLI integrates with web UI and other components
|
- **Integration Notes:** Document how CLI integrates with web UI and other components
|
||||||
- **Version Tracking:** Keep version information and feature status up to date
|
- **Version Tracking:** Keep version information and feature status up to date
|
||||||
|
|
||||||
|
**API Documentation Requirements:**
|
||||||
|
- **Endpoint Documentation:** All new API endpoints must be documented in the PRD.md with their purpose, parameters, and responses
|
||||||
|
- **README Integration:** API changes must be reflected in README.md with usage examples and integration notes
|
||||||
|
- **CLI Integration:** If CLI commands interact with APIs, they must be documented in cli/commands.txt
|
||||||
|
- **Version Tracking:** API versioning and changes must be tracked in documentation
|
||||||
|
- **Error Handling:** Document all possible error responses and status codes
|
||||||
|
- **Authentication:** Document any authentication requirements or API key usage
|
||||||
|
|
||||||
This documentation requirement is mandatory and ensures the project remains maintainable and accessible to future developers and users.
|
This documentation requirement is mandatory and ensures the project remains maintainable and accessible to future developers and users.
|
||||||
|
|
||||||
### 2.3 Code Quality & Development Standards
|
### 2.3 Code Quality & Development Standards
|
||||||
|
|||||||
40
README.md
@@ -10,6 +10,7 @@ A comprehensive tool for analyzing, deduplicating, and cleaning up large karaoke
|
|||||||
- **CDG/MP3 Pairing**: Treats CDG and MP3 files with the same base filename as single karaoke units
|
- **CDG/MP3 Pairing**: Treats CDG and MP3 files with the same base filename as single karaoke units
|
||||||
- **Channel Priority**: For MP4 files, prioritizes based on folder names in the path
|
- **Channel Priority**: For MP4 files, prioritizes based on folder names in the path
|
||||||
- **Fuzzy Matching**: Configurable fuzzy matching for artist/title comparison
|
- **Fuzzy Matching**: Configurable fuzzy matching for artist/title comparison
|
||||||
|
- **Playlist Validation**: Validates playlists against your song library with exact and fuzzy matching
|
||||||
|
|
||||||
### File Type Priority System
|
### File Type Priority System
|
||||||
1. **MP4 files** (with channel priority sorting)
|
1. **MP4 files** (with channel priority sorting)
|
||||||
@@ -32,12 +33,34 @@ A comprehensive tool for analyzing, deduplicating, and cleaning up large karaoke
|
|||||||
|
|
||||||
## Installation
|
## Installation
|
||||||
|
|
||||||
1. Clone the repository
|
### Prerequisites
|
||||||
|
|
||||||
|
- Python 3.7 or higher
|
||||||
|
- pip (Python package installer)
|
||||||
|
|
||||||
|
### Installation Steps
|
||||||
|
|
||||||
|
1. Clone the repository:
|
||||||
|
```bash
|
||||||
|
git clone <repository-url>
|
||||||
|
cd KaraokeMerge
|
||||||
|
```
|
||||||
|
|
||||||
2. Install dependencies:
|
2. Install dependencies:
|
||||||
```bash
|
```bash
|
||||||
pip install -r requirements.txt
|
pip install -r requirements.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**Note**: The installation includes:
|
||||||
|
- **Flask** for the web UI
|
||||||
|
- **fuzzywuzzy** and **python-Levenshtein** for fuzzy matching in playlist validation
|
||||||
|
- All other required dependencies
|
||||||
|
|
||||||
|
3. Verify installation:
|
||||||
|
```bash
|
||||||
|
python -c "import flask, fuzzywuzzy; print('All dependencies installed successfully!')"
|
||||||
|
```
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
### CLI Tool
|
### CLI Tool
|
||||||
@@ -64,6 +87,21 @@ The web UI will automatically:
|
|||||||
2. Start the Flask server
|
2. Start the Flask server
|
||||||
3. Open your default browser to the interface
|
3. Open your default browser to the interface
|
||||||
|
|
||||||
|
### Playlist Validation
|
||||||
|
|
||||||
|
Validate your playlists against your song library:
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python playlist_validator.py
|
||||||
|
```
|
||||||
|
|
||||||
|
Options:
|
||||||
|
- `--playlist-index N`: Validate a specific playlist by index
|
||||||
|
- `--output results.json`: Save results to a JSON file
|
||||||
|
- `--apply`: Apply corrections to playlists (use with caution)
|
||||||
|
|
||||||
|
**Note**: Playlist validation uses fuzzy matching to find potential matches. Make sure fuzzywuzzy is installed for best results.
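
As a rough illustration of exact-then-fuzzy matching, the sketch below uses Python's standard-library `difflib` instead of fuzzywuzzy so that it runs without extra dependencies. The function names, the `artist - title` key format, and the 0.85 threshold are assumptions made for this example, not the validator's actual code.

```python
from difflib import SequenceMatcher


def normalize(artist, title):
    """Build a lowercase 'artist - title' key for comparison."""
    return f"{artist.strip().lower()} - {title.strip().lower()}"


def find_match(entry, library, threshold=0.85):
    """Return (kind, song) where kind is 'exact', 'fuzzy', or 'missing'."""
    wanted = normalize(entry["artist"], entry["title"])
    best_song, best_score = None, 0.0
    for song in library:
        candidate = normalize(song["artist"], song["title"])
        if candidate == wanted:
            return "exact", song
        # Similarity ratio in [0.0, 1.0]; higher means closer.
        score = SequenceMatcher(None, wanted, candidate).ratio()
        if score > best_score:
            best_song, best_score = song, score
    if best_score >= threshold:
        return "fuzzy", best_song
    return "missing", None
```

A fuzzy result is only a candidate match, which is one reason to review results before running with `--apply`.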
|
||||||
|
|
||||||
### Priority Preferences
|
### Priority Preferences
|
||||||
|
|
||||||
The web UI now supports drag-and-drop priority management:
|
The web UI now supports drag-and-drop priority management:
|
||||||
|
|||||||
443
cli/commands.txt
@@ -1,77 +1,117 @@
|
|||||||
# Karaoke Song Library Cleanup Tool - CLI Commands Reference
|
# Karaoke Song Library Cleanup Tool - CLI Commands Reference (v2.0)
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
The CLI tool analyzes karaoke song collections, identifies duplicates, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.
|
The CLI tool analyzes karaoke song collections, identifies duplicates, validates playlists, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.
|
||||||
|
|
||||||
## Basic Usage
|
## Quick Start Commands
|
||||||
|
|
||||||
### Standard Analysis
|
### Basic Analysis (Most Common)
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py
|
cd cli
|
||||||
|
python3 main.py
|
||||||
```
|
```
|
||||||
Runs the tool with default settings:
|
Runs the tool with default settings:
|
||||||
- Input: `data/allSongs.json`
|
- Input: `data/allSongs.json`
|
||||||
- Config: `config/config.json`
|
- Config: `config/config.json`
|
||||||
- Output: `data/skipSongs.json`
|
- Output: `data/skipSongs.json`
|
||||||
- Verbose: Disabled
|
- Reports: **Automatically generated**
|
||||||
- Reports: **Automatically generated** (including web UI data)
|
|
||||||
|
|
||||||
### Verbose Output
|
### Process Everything (Recommended)
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --verbose
|
cd cli
|
||||||
|
python3 main.py --process-all
|
||||||
|
```
|
||||||
|
Complete processing including:
|
||||||
|
- Duplicate analysis and skip list generation
|
||||||
|
- Favorites processing with priority logic (MP4 over MP3)
|
||||||
|
- History processing with priority logic
|
||||||
|
- Comprehensive report generation
|
||||||
|
|
||||||
|
## Main CLI Commands (main.py)
|
||||||
|
|
||||||
|
### Basic Analysis Commands
|
||||||
|
|
||||||
|
#### Standard Analysis
|
||||||
|
```bash
|
||||||
|
python3 main.py
|
||||||
|
```
|
||||||
|
Runs the tool with default settings and generates all reports automatically.
|
||||||
|
|
||||||
|
#### Verbose Output
|
||||||
|
```bash
|
||||||
|
python3 main.py --verbose
|
||||||
# or
|
# or
|
||||||
python cli/main.py -v
|
python3 main.py -v
|
||||||
```
|
```
|
||||||
Enables detailed output showing:
|
Enables detailed output showing individual song processing and decisions.
|
||||||
- Individual song processing
|
|
||||||
- Duplicate detection details
|
|
||||||
- File type analysis
|
|
||||||
- Channel priority decisions
|
|
||||||
|
|
||||||
### Dry Run Mode
|
#### Dry Run Mode
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --dry-run
|
python3 main.py --dry-run
|
||||||
```
|
```
|
||||||
Analyzes songs without generating the skip list file. Useful for:
|
Analyzes songs without generating the skip list file. Useful for testing and previewing results.
|
||||||
- Testing configuration changes
|
|
||||||
- Previewing results before committing
|
|
||||||
- Validating input data
|
|
||||||
|
|
||||||
## Configuration Options
|
### Configuration Commands
|
||||||
|
|
||||||
### Custom Configuration File
|
#### Custom Configuration File
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --config path/to/custom_config.json
|
python3 main.py --config path/to/custom_config.json
|
||||||
```
|
```
|
||||||
Uses a custom configuration file instead of the default `config/config.json`.
|
Uses a custom configuration file instead of the default `config/config.json`.
|
||||||
|
|
||||||
### Show Current Configuration
|
#### Show Current Configuration
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --show-config
|
python3 main.py --show-config
|
||||||
```
|
```
|
||||||
Displays the current configuration settings and exits. Useful for:
|
Displays the current configuration settings and exits.
|
||||||
- Verifying configuration values
|
|
||||||
- Debugging configuration issues
|
|
||||||
- Understanding current settings
|
|
||||||
|
|
||||||
## Input/Output Options
|
### Input/Output Commands
|
||||||
|
|
||||||
### Custom Input File
|
#### Custom Input File
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --input path/to/songs.json
|
python3 main.py --input path/to/songs.json
|
||||||
```
|
```
|
||||||
Specifies a custom input file instead of the default `data/allSongs.json`.
|
Specifies a custom input file instead of the default `data/allSongs.json`.
|
||||||
|
|
||||||
### Custom Output Directory
|
#### Custom Output Directory
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --output-dir ./custom_output
|
python3 main.py --output-dir ./custom_output
|
||||||
```
|
```
|
||||||
Saves output files to a custom directory instead of the default `data/` folder.
|
Saves output files to a custom directory instead of the default `data/` folder.
|
||||||
|
|
||||||
## Report Generation
|
### Processing Commands
|
||||||
|
|
||||||
### Detailed Reports (Always Generated)
|
#### Process Favorites Only
|
||||||
Reports are now **automatically generated** every time you run the CLI tool. The `--save-reports` flag is kept for backward compatibility but is no longer required.
|
```bash
|
||||||
|
python3 main.py --process-favorites
|
||||||
|
```
|
||||||
|
Processes favorites with priority-based logic to select best versions (MP4 over MP3).
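
The priority selection described above could look roughly like the following sketch. The MP4-over-MP3 ordering comes from this reference; ranking CDG last, the helper names, and matching on lowercase artist/title are assumptions made for illustration only.

```python
import os

# Lower number = higher priority. MP4 over MP3 is documented above;
# placing CDG last is an assumption for this sketch.
EXTENSION_PRIORITY = {".mp4": 0, ".mp3": 1, ".cdg": 2}


def pick_best_version(candidates):
    """Return the candidate whose file extension has the highest priority."""
    def rank(song):
        ext = os.path.splitext(song["path"])[1].lower()
        return EXTENSION_PRIORITY.get(ext, len(EXTENSION_PRIORITY))
    return min(candidates, key=rank)


def upgrade_favorites(favorites, library):
    """Point each favorite at the best library version of the same song."""
    by_key = {}
    for song in library:
        key = (song["artist"].lower(), song["title"].lower())
        by_key.setdefault(key, []).append(song)
    upgraded = []
    for fav in favorites:
        key = (fav["artist"].lower(), fav["title"].lower())
        # Favorites with no library match keep their original path.
        best = pick_best_version(by_key.get(key, [fav]))
        upgraded.append({**fav, "path": best["path"]})
    return upgraded
```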
|
||||||
|
|
||||||
|
#### Process History Only
|
||||||
|
```bash
|
||||||
|
python3 main.py --process-history
|
||||||
|
```
|
||||||
|
Processes history with priority-based logic to select best versions (MP4 over MP3).
|
||||||
|
|
||||||
|
#### Process Everything
|
||||||
|
```bash
|
||||||
|
python3 main.py --process-all
|
||||||
|
```
|
||||||
|
Runs the full pipeline: analyzes duplicates, generates reports, and updates favorites/history with priority logic.
|
||||||
|
|
||||||
|
#### Merge History Objects
|
||||||
|
```bash
|
||||||
|
python3 main.py --merge-history
|
||||||
|
```
|
||||||
|
Merges history objects that match on artist, title, and path, summing their count properties.
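
A minimal sketch of this merge, assuming entries match only when artist, title, and path are exactly equal and that a missing `count` is treated as 0 (both assumptions; the function name is hypothetical):

```python
def merge_history(entries):
    """Merge entries sharing artist, title, and path; sum their counts."""
    merged = {}
    for entry in entries:
        key = (entry["artist"], entry["title"], entry["path"])
        if key in merged:
            merged[key]["count"] += entry.get("count", 0)
        else:
            # Copy so the caller's objects are not mutated.
            merged[key] = {**entry, "count": entry.get("count", 0)}
    return list(merged.values())
```

Keying on a tuple keeps the merge to a single pass, and the first occurrence of each song determines its position in the output.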
|
||||||
|
|
||||||
|
### Report Generation
|
||||||
|
|
||||||
|
#### Save Detailed Reports (Legacy)
|
||||||
|
```bash
|
||||||
|
python3 main.py --save-reports
|
||||||
|
```
|
||||||
|
**Note**: Reports are now automatically generated every time you run the CLI tool. This flag is kept for backward compatibility.
|
||||||
|
|
||||||
Generated reports include:
|
Generated reports include:
|
||||||
- `enhanced_summary_report.txt` - Comprehensive analysis
|
- `enhanced_summary_report.txt` - Comprehensive analysis
|
||||||
@@ -82,43 +122,244 @@ Generated reports include:
|
|||||||
- `analysis_data.json` - Raw analysis data for further processing
|
- `analysis_data.json` - Raw analysis data for further processing
|
||||||
- `skip_songs_detailed.json` - **Web UI data (always generated)**
|
- `skip_songs_detailed.json` - **Web UI data (always generated)**
|
||||||
|
|
||||||
## Combined Examples
|
## Playlist Validator Commands (playlist_validator.py)
|
||||||
|
|
||||||
### Full Analysis with Reports
|
### Basic Playlist Validation
|
||||||
|
|
||||||
|
#### Validate All Playlists
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --verbose
|
python3 playlist_validator.py
|
||||||
```
|
```
|
||||||
Runs complete analysis with:
|
Validates all playlists in `data/songLists.json` against the song library.
|
||||||
|
|
||||||
|
#### Validate Specific Playlist
|
||||||
|
```bash
|
||||||
|
python3 playlist_validator.py --playlist-index 0
|
||||||
|
```
|
||||||
|
Validates a specific playlist by index (0-based).
|
||||||
|
|
||||||
|
### Playlist Validator Options
|
||||||
|
|
||||||
|
#### Custom Configuration
|
||||||
|
```bash
|
||||||
|
python3 playlist_validator.py --config path/to/custom_config.json
|
||||||
|
```
|
||||||
|
Uses a custom configuration file.
|
||||||
|
|
||||||
|
#### Custom Data Directory
|
||||||
|
```bash
|
||||||
|
python3 playlist_validator.py --data-dir path/to/data
|
||||||
|
```
|
||||||
|
Uses a custom data directory.
|
||||||
|
|
||||||
|
#### Apply Changes (Disable Dry Run)
|
||||||
|
```bash
|
||||||
|
python3 playlist_validator.py --apply
|
||||||
|
```
|
||||||
|
Applies changes to playlists instead of just previewing them.
|
||||||
|
|
||||||
|
#### Output Results to File
|
||||||
|
```bash
|
||||||
|
python3 playlist_validator.py --output results.json
|
||||||
|
```
|
||||||
|
Saves validation results to a JSON file.
|
||||||
|
|
||||||
|
## Comprehensive Examples
|
||||||
|
|
||||||
|
### Complete Workflow Examples
|
||||||
|
|
||||||
|
#### 1. Full Analysis with Everything
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 main.py --process-all --verbose
|
||||||
|
```
|
||||||
|
Complete processing with detailed output:
|
||||||
|
- Duplicate analysis and skip list generation
|
||||||
|
- Favorites and history processing with priority logic
|
||||||
|
- Comprehensive report generation
|
||||||
- Verbose output for detailed processing information
|
- Verbose output for detailed processing information
|
||||||
- **Automatic comprehensive report generation**
|
|
||||||
- Skip list creation
|
|
||||||
|
|
||||||
### Custom Configuration with Dry Run
|
#### 2. Preview Changes Before Applying
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --config custom_config.json --dry-run --verbose
|
cd cli
|
||||||
|
python3 main.py --process-all --dry-run --verbose
|
||||||
```
|
```
|
||||||
Tests a custom configuration without generating files:
|
Preview all changes without saving:
|
||||||
- Uses custom configuration
|
- Shows what would be processed
|
||||||
|
- No files are modified
|
||||||
|
- Useful for testing configuration changes
|
||||||
|
|
||||||
|
#### 3. Custom Configuration Testing
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 main.py --config custom_config.json --dry-run --verbose
|
||||||
|
```
|
||||||
|
Test a custom configuration:
|
||||||
|
- Uses custom configuration file
|
||||||
- Shows detailed processing
|
- Shows detailed processing
|
||||||
- No output files created
|
- No output files created
|
||||||
|
|
||||||
### Custom Input/Output with Reports
|
#### 4. Process Only Favorites and History
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --input /path/to/songs.json --output-dir ./reports
|
cd cli
|
||||||
|
python3 main.py --process-favorites --process-history
|
||||||
|
```
|
||||||
|
Process only favorites and history files:
|
||||||
|
- Updates favorites with best versions (MP4 over MP3)
|
||||||
|
- Updates history with best versions
|
||||||
|
- No duplicate analysis performed
|
||||||
|
|
||||||
|
#### 5. Merge History Objects
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 main.py --merge-history --dry-run
|
||||||
|
```
|
||||||
|
Preview history merging:
|
||||||
|
- Shows which history objects would be merged
|
||||||
|
- No files are modified
|
||||||
|
|
||||||
|
#### 6. Apply History Merging
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 main.py --merge-history
|
||||||
|
```
|
||||||
|
Actually merge history objects:
|
||||||
|
- Combines duplicate history entries
|
||||||
|
- Sums count properties
|
||||||
|
- Saves updated history file
|
||||||
|
|
||||||
|
### Playlist Validation Examples
|
||||||
|
|
||||||
|
#### 1. Validate All Playlists
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 playlist_validator.py
|
||||||
|
```
|
||||||
|
Validates all playlists and shows summary:
|
||||||
|
- Total playlists and songs
|
||||||
|
- Exact matches found
|
||||||
|
- Missing songs count
|
||||||
|
- Fuzzy matches (if available)
|
||||||
|
|
||||||
|
#### 2. Validate Specific Playlist
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 playlist_validator.py --playlist-index 5
|
||||||
|
```
|
||||||
|
Validates playlist at index 5:
|
||||||
|
- Shows detailed results for that specific playlist
|
||||||
|
- Lists exact matches and missing songs
|
||||||
|
|
||||||
|
#### 3. Save Validation Results
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 playlist_validator.py --output validation_results.json
|
||||||
|
```
|
||||||
|
Saves detailed validation results to JSON file for further analysis.
|
||||||
|
|
||||||
|
#### 4. Apply Playlist Corrections
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 playlist_validator.py --apply
|
||||||
|
```
|
||||||
|
Applies corrections to playlists (use with caution).
|
||||||
|
|
||||||
|
### Advanced Examples
|
||||||
|
|
||||||
|
#### 1. Custom Input/Output with Full Processing
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 main.py --input /path/to/songs.json --output-dir ./reports --process-all --verbose
|
||||||
```
|
```
|
||||||
Processes custom input and saves all outputs to reports directory:
|
Processes custom input and saves all outputs to reports directory:
|
||||||
- Custom input file
|
- Custom input file
|
||||||
- Custom output location
|
- Custom output location
|
||||||
- **All report files automatically generated**
|
- Full processing including favorites/history
|
||||||
|
- Verbose output
|
||||||
|
|
||||||
### Minimal Output
|
#### 2. Configuration Testing Workflow
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --output-dir ./minimal
|
cd cli
|
||||||
|
# Show current configuration
|
||||||
|
python3 main.py --show-config
|
||||||
|
|
||||||
|
# Test with dry run
|
||||||
|
python3 main.py --dry-run --verbose
|
||||||
|
|
||||||
|
# Test with custom config
|
||||||
|
python3 main.py --config test_config.json --dry-run --verbose
|
||||||
```
|
```
|
||||||
Runs with minimal output:
|
|
||||||
- No verbose logging
|
#### 3. Playlist Analysis Workflow
|
||||||
- No detailed reports
|
```bash
|
||||||
- Only generates skip list
|
cd cli
|
||||||
|
# Validate all playlists
|
||||||
|
python3 playlist_validator.py
|
||||||
|
|
||||||
|
# Validate specific playlist
|
||||||
|
python3 playlist_validator.py --playlist-index 0
|
||||||
|
|
||||||
|
# Save detailed results
|
||||||
|
python3 playlist_validator.py --output playlist_analysis.json
|
||||||
|
```
|
||||||
|
|
||||||
|
#### 4. Complete System Analysis
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
# Process everything
|
||||||
|
python3 main.py --process-all --verbose
|
||||||
|
|
||||||
|
# Validate playlists
|
||||||
|
python3 playlist_validator.py
|
||||||
|
|
||||||
|
# Show configuration
|
||||||
|
python3 main.py --show-config
|
||||||
|
```
|
||||||
|
|
||||||
|
## Command Line Options Reference
|
||||||
|
|
||||||
|
### Main CLI (main.py) Options
|
||||||
|
|
||||||
|
| Option | Description | Default |
|
||||||
|
|--------|-------------|---------|
|
||||||
|
| `--config` | Configuration file path | `../config/config.json` |
|
||||||
|
| `--input` | Input songs file path | `../data/allSongs.json` |
|
||||||
|
| `--output-dir` | Output directory | `../data` |
|
||||||
|
| `--verbose, -v` | Enable verbose output | `False` |
|
||||||
|
| `--dry-run` | Analyze without generating files | `False` |
|
||||||
|
| `--save-reports` | Legacy flag; reports are always generated | `True` (always enabled) |
|
||||||
|
| `--show-config` | Show configuration and exit | `False` |
|
||||||
|
| `--process-favorites` | Process favorites with priority logic | `False` |
|
||||||
|
| `--process-history` | Process history with priority logic | `False` |
|
||||||
|
| `--process-all` | Run duplicate analysis, report generation, and favorites/history processing | `False` |
|
||||||
|
| `--merge-history` | Merge history objects matching on artist, title, and path, summing counts | `False` |
|
||||||
|
|
||||||
|
### Playlist Validator (playlist_validator.py) Options
|
||||||
|
|
||||||
|
| Option | Description | Default |
|
||||||
|
|--------|-------------|---------|
|
||||||
|
| `--config` | Configuration file path | `../config/config.json` |
|
||||||
|
| `--data-dir` | Data directory path | `../data` |
|
||||||
|
| `--dry-run` | Preview changes without modifying playlists | `True` |
|
||||||
|
| `--apply` | Apply changes (disable dry run) | `False` |
|
||||||
|
| `--playlist-index` | Validate specific playlist by index | `None` |
|
||||||
|
| `--output` | Output results to JSON file | `None` |
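
Because `--dry-run` defaults to `True`, the table implies that `--apply` is the switch that actually enables writes. One way such a parser might be wired up with `argparse` is sketched below; `build_parser` and `is_dry_run` are hypothetical names, and the real script may resolve the two flags differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options table above."""
    parser = argparse.ArgumentParser(description="Validate playlists")
    parser.add_argument("--config", default="../config/config.json")
    parser.add_argument("--data-dir", default="../data")
    parser.add_argument("--dry-run", action="store_true", default=True)
    parser.add_argument("--apply", action="store_true", default=False)
    parser.add_argument("--playlist-index", type=int, default=None)
    parser.add_argument("--output", default=None)
    return parser


def is_dry_run(args):
    """Assumed rule: changes are written only when --apply is passed."""
    return not args.apply
```

Under this assumed rule, passing `--dry-run` explicitly is harmless but redundant, since it matches the default.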
|
||||||
|
|
||||||
|
## File Structure Requirements
|
||||||
|
|
||||||
|
### Required Files
|
||||||
|
- `data/allSongs.json` - Main song library
|
||||||
|
- `config/config.json` - Configuration settings
|
||||||
|
|
||||||
|
### Optional Files
|
||||||
|
- `data/favorites.json` - Favorites list (for processing)
|
||||||
|
- `data/history.json` - History list (for processing)
|
||||||
|
- `data/songLists.json` - Playlists (for validation)
|
||||||
|
|
||||||
|
### Generated Files
|
||||||
|
- `data/skipSongs.json` - Skip list for future imports
|
||||||
|
- `data/reports/` - Directory containing all analysis reports
|
||||||
|
- `data/preferences/` - Directory containing priority preferences
|
||||||
|
|
||||||
## Configuration File Structure
|
## Configuration File Structure
|
||||||
|
|
||||||
@@ -148,31 +389,9 @@ The default configuration file (`config/config.json`) contains:
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
### Configuration Options Explained
|
## Input File Formats
|
||||||
|
|
||||||
#### Channel Priorities
|
|
||||||
- **channel_priorities**: Array of folder names for MP4 files
|
|
||||||
- Order determines priority (first = highest priority)
|
|
||||||
- Files without matching folders are marked for manual review
|
|
||||||
|
|
||||||
#### Matching Settings
|
|
||||||
- **fuzzy_matching**: Enable/disable fuzzy string matching
|
|
||||||
- **fuzzy_threshold**: Similarity threshold (0.0-1.0) for fuzzy matching
|
|
||||||
- **case_sensitive**: Case-sensitive artist/title comparison
|
|
||||||
|
|
||||||
#### Output Settings
|
|
||||||
- **verbose**: Enable detailed output
|
|
||||||
- **include_reasons**: Include reason field in skip list
|
|
||||||
- **max_duplicates_per_song**: Maximum duplicates to process per song
|
|
||||||
|
|
||||||
#### File Type Settings
|
|
||||||
- **supported_extensions**: All supported file extensions
|
|
||||||
- **mp4_extensions**: Extensions treated as MP4 files
|
|
||||||
|
|
||||||
## Input File Format
|
|
||||||
|
|
||||||
The tool expects a JSON array of song objects:
|
|
||||||
|
|
||||||
|
### Song Library Format (allSongs.json)
|
||||||
```json
|
```json
|
||||||
[
|
[
|
||||||
{
|
{
|
||||||
@@ -183,9 +402,45 @@ The tool expects a JSON array of song objects:
|
|||||||
]
|
]
|
||||||
```
|
```
|
||||||
|
|
||||||
Optional fields for MP4 files:
|
### Playlist Format (songLists.json)
|
||||||
- `channel`: Channel/folder information
|
```json
|
||||||
- ID3 tag information (artist, title, etc.)
|
[
|
||||||
|
{
|
||||||
|
"title": "Playlist Name",
|
||||||
|
"songs": [
|
||||||
|
{
|
||||||
|
"position": 1,
|
||||||
|
"artist": "Artist Name",
|
||||||
|
"title": "Song Title"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Favorites Format (favorites.json)
|
||||||
|
```json
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"artist": "Artist Name",
|
||||||
|
"title": "Song Title",
|
||||||
|
"path": "path/to/file.mp3",
|
||||||
|
"favorite": true
|
||||||
|
}
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
### History Format (history.json)
|
||||||
|
```json
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"artist": "Artist Name",
|
||||||
|
"title": "Song Title",
|
||||||
|
"path": "path/to/file.mp3",
|
||||||
|
"count": 5
|
||||||
|
}
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
## Output Files
|
## Output Files
|
||||||
|
|
||||||
@@ -193,7 +448,7 @@ Optional fields for MP4 files:
|
|||||||
- **skipSongs.json**: List of file paths to skip in future imports
|
- **skipSongs.json**: List of file paths to skip in future imports
|
||||||
- Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`
|
- Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`
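
A downstream importer could consume the skip list along the lines of the sketch below. The function names are hypothetical, and only the documented `path` field is read.

```python
import json


def skip_paths_from_entries(entries):
    """Collect the 'path' of every skip-list entry into a set."""
    return {entry["path"] for entry in entries}


def load_skip_paths(skip_file):
    """Read a skipSongs.json-style file and return the paths to skip."""
    with open(skip_file, encoding="utf-8") as handle:
        return skip_paths_from_entries(json.load(handle))


def filter_import(candidate_paths, skip_paths):
    """Keep only the candidate paths that are not on the skip list."""
    return [path for path in candidate_paths if path not in skip_paths]
```

Using a set keeps each membership check constant-time, which helps with libraries of tens of thousands of songs.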
|
||||||
|
|
||||||
### Report Files (with --save-reports)
|
### Report Files (Automatically Generated)
|
||||||
- **enhanced_summary_report.txt**: Overall analysis and statistics
|
- **enhanced_summary_report.txt**: Overall analysis and statistics
|
||||||
- **channel_optimization_report.txt**: Channel priority suggestions
|
- **channel_optimization_report.txt**: Channel priority suggestions
|
||||||
- **duplicate_pattern_report.txt**: Duplicate detection patterns
|
- **duplicate_pattern_report.txt**: Duplicate detection patterns
|
||||||
@@ -222,7 +477,7 @@ The tool provides clear error messages for:
|
|||||||
|
|
||||||
## Performance Notes
|
## Performance Notes
|
||||||
|
|
||||||
- Successfully tested with 37,000+ songs
|
- Successfully tested with 49,000+ songs
|
||||||
- Processes large datasets efficiently
|
- Processes large datasets efficiently
|
||||||
- Shows progress indicators for long operations
|
- Shows progress indicators for long operations
|
||||||
- Memory-efficient processing
|
- Memory-efficient processing
|
||||||
@@ -245,17 +500,29 @@ The CLI tool integrates with the web UI:
|
|||||||
|
|
||||||
### Debug Mode
|
### Debug Mode
|
||||||
```bash
|
```bash
|
||||||
python cli/main.py --verbose --dry-run --show-config
|
cd cli
|
||||||
|
python3 main.py --verbose --dry-run --show-config
|
||||||
```
|
```
|
||||||
Complete debugging setup:
|
Complete debugging setup:
|
||||||
- Shows configuration
|
- Shows configuration
|
||||||
- Verbose processing
|
- Verbose processing
|
||||||
- No file changes
|
- No file changes
|
||||||
|
|
||||||
|
### Playlist Validator Debug
|
||||||
|
```bash
|
||||||
|
cd cli
|
||||||
|
python3 playlist_validator.py --dry-run --output debug_results.json
|
||||||
|
```
|
||||||
|
Debug playlist validation:
|
||||||
|
- Dry run mode
|
||||||
|
- Save results to file
|
||||||
|
- No playlist modifications
|
||||||
|
|
||||||
## Version Information
|
## Version Information
|
||||||
|
|
||||||
This commands reference is for Karaoke Song Library Cleanup Tool v2.0
|
This commands reference is for Karaoke Song Library Cleanup Tool v2.0
|
||||||
- CLI: Fully functional with comprehensive options
|
- CLI: Fully functional with comprehensive options
|
||||||
- Web UI: Interactive priority management
|
- Web UI: Interactive priority management
|
||||||
- Priority System: Drag-and-drop with persistence
|
- Priority System: Drag-and-drop with persistence
|
||||||
- Reports: Enhanced analysis with actionable insights
|
- Reports: Enhanced analysis with actionable insights
|
||||||
|
- Playlist Validator: Complete playlist analysis and validation
|
||||||
124359
cli/complete_playlist_validation.json
Normal file
File diff suppressed because it is too large
98157
cli/final_playlist_validation.json
Normal file
File diff suppressed because it is too large
@@ -17,6 +17,7 @@ from utils import (
|
|||||||
extract_consolidated_channel_from_path,
|
extract_consolidated_channel_from_path,
|
||||||
get_file_extension,
|
get_file_extension,
|
||||||
parse_multi_artist,
|
parse_multi_artist,
|
||||||
|
clean_artist_name,
|
||||||
validate_song_data,
|
validate_song_data,
|
||||||
find_mp3_pairs
|
find_mp3_pairs
|
||||||
)
|
)
|
||||||
@@ -63,10 +64,15 @@ class SongMatcher:
|
|||||||
if not validate_song_data(song):
|
if not validate_song_data(song):
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Handle multi-artist songs
|
# Clean and handle artist names
|
||||||
artists = parse_multi_artist(song['artist'])
|
cleaned_artist = clean_artist_name(song['artist'])
|
||||||
|
if not cleaned_artist:
|
||||||
|
cleaned_artist = song['artist'] # Fallback to original if cleaning fails
|
||||||
|
|
||||||
|
# Handle multi-artist songs (after cleaning)
|
||||||
|
artists = parse_multi_artist(cleaned_artist)
|
||||||
if not artists:
|
if not artists:
|
||||||
artists = [song['artist']]
|
artists = [cleaned_artist]
|
||||||
|
|
||||||
# Create groups for each artist variation
|
# Create groups for each artist variation
|
||||||
for artist in artists:
|
for artist in artists:
|
||||||
@@ -90,10 +96,15 @@ class SongMatcher:
|
|||||||
if i % 1000 == 0 and i > 0:
|
if i % 1000 == 0 and i > 0:
|
||||||
print(f"Processing song {i:,}/{len(songs):,}...")
|
print(f"Processing song {i:,}/{len(songs):,}...")
|
||||||
|
|
||||||
# Handle multi-artist songs
|
# Clean and handle artist names
|
||||||
artists = parse_multi_artist(song['artist'])
|
cleaned_artist = clean_artist_name(song['artist'])
|
||||||
|
if not cleaned_artist:
|
||||||
|
cleaned_artist = song['artist'] # Fallback to original if cleaning fails
|
||||||
|
|
||||||
|
# Handle multi-artist songs (after cleaning)
|
||||||
|
artists = parse_multi_artist(cleaned_artist)
|
||||||
if not artists:
|
if not artists:
|
||||||
artists = [song['artist']]
|
artists = [cleaned_artist]
|
||||||
|
|
||||||
# Try exact matching first
|
# Try exact matching first
|
||||||
added_to_exact = False
|
added_to_exact = False
|
||||||
@@ -117,10 +128,15 @@ class SongMatcher:
|
|||||||
if i % 100 == 0 and i > 0:
|
if i % 100 == 0 and i > 0:
|
||||||
print(f"Fuzzy matching song {i:,}/{len(ungrouped_songs):,}...")
|
print(f"Fuzzy matching song {i:,}/{len(ungrouped_songs):,}...")
|
||||||
|
|
||||||
# Handle multi-artist songs
|
# Clean and handle artist names
|
||||||
artists = parse_multi_artist(song['artist'])
|
cleaned_artist = clean_artist_name(song['artist'])
|
||||||
|
if not cleaned_artist:
|
||||||
|
cleaned_artist = song['artist'] # Fallback to original if cleaning fails
|
||||||
|
|
||||||
|
# Handle multi-artist songs (after cleaning)
|
||||||
|
artists = parse_multi_artist(cleaned_artist)
|
||||||
if not artists:
|
if not artists:
|
||||||
artists = [song['artist']]
|
artists = [cleaned_artist]
|
||||||
|
|
||||||
# Try to find an existing fuzzy group
|
# Try to find an existing fuzzy group
|
||||||
added_to_group = False
|
added_to_group = False
|
||||||
|
|||||||
99907
cli/playlist_validation_results.json
Normal file
File diff suppressed because it is too large
@@ -21,6 +21,7 @@ from utils import (
     extract_channel_from_path,
     get_file_extension,
     parse_multi_artist,
+    clean_artist_name,
     validate_song_data
 )
 
@@ -63,10 +64,15 @@ class PlaylistValidator:
             if not validate_song_data(song):
                 continue
 
-            # Handle multi-artist songs
-            artists = parse_multi_artist(song['artist'])
+            # Clean and handle artist names
+            cleaned_artist = clean_artist_name(song['artist'])
+            if not cleaned_artist:
+                cleaned_artist = song['artist']  # Fallback to original if cleaning fails
+
+            # Handle multi-artist songs (after cleaning)
+            artists = parse_multi_artist(cleaned_artist)
             if not artists:
-                artists = [song['artist']]
+                artists = [cleaned_artist]
 
             # Create exact match keys
             for artist in artists:
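The same clean-then-split sequence, with its two fallbacks, is repeated in each of the hunks above. A minimal sketch of how it could be factored into one helper follows. The name `resolve_artists` and its callable parameters are hypothetical and not part of this commit; in the project the two callables would presumably be `clean_artist_name` and `parse_multi_artist`.

```python
from typing import Callable, List


def resolve_artists(
    raw_artist: str,
    clean: Callable[[str], str],
    split: Callable[[str], List[str]],
) -> List[str]:
    """Clean an artist string, then split it, keeping both fallbacks.

    Hypothetical helper: `clean` and `split` stand in for the project's
    clean_artist_name and parse_multi_artist.
    """
    # Fall back to the raw string if cleaning leaves nothing
    cleaned = clean(raw_artist) or raw_artist
    # Fall back to a one-item list if splitting leaves nothing
    return split(cleaned) or [cleaned]
```

Each call site would then reduce to a single line such as `artists = resolve_artists(song['artist'], clean_artist_name, parse_multi_artist)`.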
44
cli/utils.py
@@ -218,6 +218,50 @@ def extract_consolidated_channel_from_path(file_path: str, channel_priorities: L
     return None
 
 
+def clean_artist_name(artist_string: str) -> str:
+    """Clean artist name by removing features, collaborations, etc."""
+    if not artist_string:
+        return ""
+
+    # Remove common feature/collaboration patterns (more precise)
+    patterns_to_remove = [
+        r'\s*feat\.?\s*.*$',  # feat. anything after
+        r'\s*ft\.?\s*.*$',  # ft. anything after
+        r'\s*featuring\s*.*$',  # featuring anything after
+        r'\s*with\s*.*$',  # with anything after
+        r'\s*presents\s*.*$',  # presents anything after
+        r'\s*featuring\s*.*$',  # featuring anything after
+        r'\s*feat\s*.*$',  # feat anything after
+        r'\s*ft\s*.*$',  # ft anything after
+    ]
+
+    # Handle comma/semicolon/slash patterns more carefully
+    # Only remove if they're followed by feature words
+    separator_patterns = [
+        r'\s*,\s*(feat\.?|ft\.?|featuring|with|presents).*$',  # comma followed by feature words
+        r'\s*;\s*(feat\.?|ft\.?|featuring|with|presents).*$',  # semicolon followed by feature words
+        r'\s*/\s*(feat\.?|ft\.?|featuring|with|presents).*$',  # slash followed by feature words
+    ]
+
+    cleaned_artist = artist_string
+
+    # Apply feature removal patterns first
+    for pattern in patterns_to_remove:
+        cleaned_artist = re.sub(pattern, '', cleaned_artist, flags=re.IGNORECASE)
+
+    # Apply separator patterns only if they're followed by feature words
+    for pattern in separator_patterns:
+        cleaned_artist = re.sub(pattern, '', cleaned_artist, flags=re.IGNORECASE)
+
+    # Clean up any trailing separators that might be left
+    cleaned_artist = re.sub(r'\s*[,;/]\s*$', '', cleaned_artist)
+
+    # Clean up extra whitespace
+    cleaned_artist = re.sub(r'\s+', ' ', cleaned_artist).strip()
+
+    return cleaned_artist
+
+
 def parse_multi_artist(artist_string: str) -> List[str]:
     """Parse multi-artist strings with various delimiters."""
     if not artist_string:
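Review note on the patterns in `clean_artist_name`: none of them uses a word boundary, and the leading `\s*` can match nothing. Under `re.IGNORECASE`, `ft` therefore matches inside `Daft Punk` (leaving `Da`) and `with` matches inside `Bill Withers` (leaving `Bill`). The sketch below is one possible tightening, not the committed code: `FEATURE_SUFFIX` and `strip_feature_suffix` are hypothetical names, and the choice to also accept an opening bracket before the feature word is an assumption.

```python
import re

# Hypothetical single pattern with word boundaries, so "ft" inside "Daft"
# and "with" inside "Withers" are left alone. An optional separator or
# opening bracket may precede the feature word.
FEATURE_SUFFIX = re.compile(
    r'\s*[,;/(\[]?\s*\b(?:featuring|feat|ft|with|presents)\b\.?.*$',
    re.IGNORECASE,
)


def strip_feature_suffix(artist: str) -> str:
    """Drop a trailing 'feat. X' style suffix and normalise whitespace."""
    if not artist:
        return ""
    cleaned = FEATURE_SUFFIX.sub('', artist)
    # Remove a separator left dangling at the end
    cleaned = re.sub(r'\s*[,;/]\s*$', '', cleaned)
    # Collapse runs of whitespace
    return re.sub(r'\s+', ' ', cleaned).strip()
```

A standalone word still triggers the pattern, so a name such as `Play With Fire` would be truncated by this sketch as well. The existing empty-result fallback in the callers only covers the case where nothing at all is left.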
@@ -5,7 +5,7 @@
         "Stingray Karaoke"
     ],
     "matching": {
-        "fuzzy_matching": false,
+        "fuzzy_matching": true,
         "fuzzy_threshold": 0.85,
         "case_sensitive": false
     },
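With `fuzzy_matching` switched on, `fuzzy_threshold` and `case_sensitive` become the settings that decide whether two strings count as the same song. How the project reads them is not shown in this diff. The sketch below uses the standard library's `difflib.SequenceMatcher` as a stand-in scorer, and `is_fuzzy_match` is a hypothetical name. fuzzywuzzy's `fuzz.ratio` returns an integer from 0 to 100, so a threshold of `0.85` would have to be scaled to `85` before being compared with it.

```python
from difflib import SequenceMatcher


def is_fuzzy_match(a: str, b: str, threshold: float = 0.85,
                   case_sensitive: bool = False) -> bool:
    """Return True when the similarity of a and b reaches the threshold.

    Stand-in scorer: SequenceMatcher.ratio() yields a float in [0, 1],
    which lines up with a threshold written as 0.85.
    """
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    return SequenceMatcher(None, a, b).ratio() >= threshold
```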
@@ -1,16 +1,12 @@
 # Python dependencies for KaraokeMerge CLI tool
 
-# Core dependencies (currently using only standard library)
-# No external dependencies required for basic functionality
+# Core dependencies
+flask>=2.0.0
 
-# Optional dependencies for enhanced features:
-# Uncomment the following lines if you want to enable fuzzy matching:
+# Fuzzy matching dependencies (required for playlist validation)
 fuzzywuzzy>=0.18.0
 python-Levenshtein>=0.21.0
 
 # For future enhancements:
 # pandas>=1.5.0 # For advanced data analysis
 # click>=8.0.0 # For enhanced CLI interface
-
-# Web UI dependencies
-flask>=2.0.0
@@ -10,21 +10,38 @@ import webbrowser
 from time import sleep
 
 def check_dependencies():
-    """Check if Flask is installed."""
+    """Check if required dependencies are installed."""
+    dependencies_ok = True
+
+    # Check Flask
     try:
         import flask
         print("✅ Flask is installed")
-        return True
     except ImportError:
         print("❌ Flask is not installed")
         print("Installing Flask...")
         try:
             subprocess.check_call([sys.executable, "-m", "pip", "install", "flask>=2.0.0"])
             print("✅ Flask installed successfully")
-            return True
         except subprocess.CalledProcessError:
             print("❌ Failed to install Flask")
-            return False
+            dependencies_ok = False
+
+    # Check fuzzywuzzy for playlist validation
+    try:
+        import fuzzywuzzy
+        print("✅ fuzzywuzzy is installed (for playlist validation)")
+    except ImportError:
+        print("❌ fuzzywuzzy is not installed")
+        print("Installing fuzzywuzzy and python-Levenshtein...")
+        try:
+            subprocess.check_call([sys.executable, "-m", "pip", "install", "fuzzywuzzy>=0.18.0", "python-Levenshtein>=0.21.0"])
+            print("✅ fuzzywuzzy installed successfully")
+        except subprocess.CalledProcessError:
+            print("❌ Failed to install fuzzywuzzy")
+            print("⚠️ Playlist validation will work without fuzzy matching")
+
+    return dependencies_ok
 
 def check_data_files():
     """Check if required data files exist."""
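The updated `check_dependencies` repeats one try/except block per package. Detection could instead be table-driven, as in the sketch below. `missing_modules` is a hypothetical helper, not part of this commit, and it only reports which top-level modules are absent, leaving the install step to the caller.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found.

    importlib.util.find_spec returns None for a top-level module that is
    not installed, so nothing is imported just to test for presence.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["flask", "fuzzywuzzy"])` would give the list of packages still to be installed. In the hunk above only a failed Flask install changes the return value, while fuzzywuzzy is treated as optional.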
@@ -1449,4 +1449,4 @@ def apply_all_updates():
 
 
 if __name__ == '__main__':
-    app.run(debug=True, host='0.0.0.0', port=5000)
+    app.run(debug=True, host='0.0.0.0', port=5001)