# Karaoke Song Library Cleanup Tool - CLI Commands Reference

## Overview
The CLI tool analyzes karaoke song collections, identifies duplicates, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.

## Basic Usage

### Standard Analysis
```bash
python cli/main.py
```
Runs the tool with default settings:
- Input: `data/allSongs.json`
- Config: `config/config.json`
- Output: `data/skipSongs.json`
- Verbose: Disabled
- Reports: **Automatically generated** (including web UI data)

### Verbose Output
```bash
python cli/main.py --verbose
# or
python cli/main.py -v
```
Enables detailed output showing:
- Individual song processing
- Duplicate detection details
- File type analysis
- Channel priority decisions

### Dry Run Mode
```bash
python cli/main.py --dry-run
```
Analyzes songs without generating the skip list file. Useful for:
- Testing configuration changes
- Previewing results before committing
- Validating input data

## Configuration Options

### Custom Configuration File
```bash
python cli/main.py --config path/to/custom_config.json
```
Uses a custom configuration file instead of the default `config/config.json`.

### Show Current Configuration
```bash
python cli/main.py --show-config
```
Displays the current configuration settings and exits. Useful for:
- Verifying configuration values
- Debugging configuration issues
- Understanding current settings

## Input/Output Options

### Custom Input File
```bash
python cli/main.py --input path/to/songs.json
```
Specifies a custom input file instead of the default `data/allSongs.json`.

### Custom Output Directory
```bash
python cli/main.py --output-dir ./custom_output
```
Saves output files to a custom directory instead of the default `data/` folder.
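
For readers who want to see how the flags in this reference fit together, here is a minimal `argparse` sketch. It is not the tool's actual source: the `build_parser` helper and the help texts are assumptions for illustration. Only the flag names and default paths come from this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the documented flags (illustrative, not the real cli/main.py)."""
    parser = argparse.ArgumentParser(description="Karaoke Song Library Cleanup Tool")
    parser.add_argument("--config", default="config/config.json",
                        help="Path to the configuration file")
    parser.add_argument("--input", default="data/allSongs.json",
                        help="Path to the input song list")
    parser.add_argument("--output-dir", default="data",
                        help="Directory for the skip list and reports")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable detailed output")
    parser.add_argument("--dry-run", action="store_true",
                        help="Analyze without writing the skip list")
    parser.add_argument("--show-config", action="store_true",
                        help="Print the configuration and exit")
    # Documented as kept only for backward compatibility.
    parser.add_argument("--save-reports", action="store_true",
                        help="Accepted for backward compatibility")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(["--input", "path/to/songs.json", "--dry-run", "-v"])
print(args.input, args.output_dir, args.dry_run, args.verbose)
```

Options that are not passed keep the defaults listed under Standard Analysis, which is why `args.output_dir` is still `data` here.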

## Report Generation

### Detailed Reports (Always Generated)
Reports are now **automatically generated** every time you run the CLI tool. The `--save-reports` flag is kept for backward compatibility but is no longer required.

Generated reports include:
- `enhanced_summary_report.txt` - Comprehensive analysis
- `channel_optimization_report.txt` - Priority optimization suggestions
- `duplicate_pattern_report.txt` - Duplicate pattern analysis
- `actionable_insights_report.txt` - Recommendations and insights
- `detailed_duplicate_analysis.txt` - Specific songs and their duplicates
- `analysis_data.json` - Raw analysis data for further processing
- `skip_songs_detailed.json` - **Web UI data (always generated)**

## Combined Examples

### Full Analysis with Reports
```bash
python cli/main.py --verbose
```
Runs complete analysis with:
- Verbose output for detailed processing information
- **Automatic comprehensive report generation**
- Skip list creation

### Custom Configuration with Dry Run
```bash
python cli/main.py --config custom_config.json --dry-run --verbose
```
Tests a custom configuration without generating files:
- Uses custom configuration
- Shows detailed processing
- No output files created

### Custom Input/Output with Reports
```bash
python cli/main.py --input /path/to/songs.json --output-dir ./reports
```
Processes custom input and saves all outputs to the `./reports` directory:
- Custom input file
- Custom output location
- **All report files automatically generated**

### Minimal Output
```bash
python cli/main.py --output-dir ./minimal
```
Runs with minimal console output:
- No verbose logging
- Skip list written to `./minimal`
- Report files are still written, since reports are always generated

## Configuration File Structure

The default configuration file (`config/config.json`) contains:

```json
{
  "channel_priorities": [
    "Sing King Karaoke",
    "KaraFun Karaoke",
    "Stingray Karaoke"
  ],
  "matching": {
    "fuzzy_matching": false,
    "fuzzy_threshold": 0.85,
    "case_sensitive": false
  },
  "output": {
    "verbose": false,
    "include_reasons": true,
    "max_duplicates_per_song": 10
  },
  "file_types": {
    "supported_extensions": [".mp3", ".cdg", ".mp4"],
    "mp4_extensions": [".mp4"]
  }
}
```

### Configuration Options Explained

#### Channel Priorities
- **channel_priorities**: Array of folder names for MP4 files
  - Order determines priority (first = highest priority)
  - Files without matching folders are marked for manual review
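
The ordering rule can be sketched as follows. This assumes a channel is matched as an exact folder name somewhere in a forward-slash path. The tool's real matching logic may differ, and `channel_rank` and `pick_preferred` are hypothetical helper names.

```python
from pathlib import PurePosixPath
from typing import List, Optional, Tuple

CHANNEL_PRIORITIES = ["Sing King Karaoke", "KaraFun Karaoke", "Stingray Karaoke"]


def channel_rank(path: str, priorities: List[str]) -> Optional[int]:
    """Return the index of the first priority folder found in the path, or None."""
    folders = PurePosixPath(path).parts[:-1]  # directory components only
    for rank, channel in enumerate(priorities):
        if channel in folders:
            return rank
    return None


def pick_preferred(paths: List[str], priorities: List[str]) -> Tuple[Optional[str], List[str]]:
    """Split duplicate MP4 paths into (preferred path, paths needing manual review)."""
    ranked = [(channel_rank(p, priorities), p) for p in paths]
    known = sorted((r, p) for r, p in ranked if r is not None)
    review = [p for r, p in ranked if r is None]
    preferred = known[0][1] if known else None  # lowest index = highest priority
    return preferred, review


duplicates = [
    "mp4/KaraFun Karaoke/Queen - Bohemian Rhapsody.mp4",
    "mp4/Sing King Karaoke/Queen - Bohemian Rhapsody.mp4",
    "mp4/Unsorted/Queen - Bohemian Rhapsody.mp4",
]
preferred, review = pick_preferred(duplicates, CHANNEL_PRIORITIES)
print(preferred)
print(review)
```

Here the Sing King copy is preferred because it appears first in `channel_priorities`, and the copy under `Unsorted` lands in the manual-review list.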

#### Matching Settings
- **fuzzy_matching**: Enable/disable fuzzy string matching
- **fuzzy_threshold**: Similarity threshold (0.0-1.0) for fuzzy matching
- **case_sensitive**: Case-sensitive artist/title comparison
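
As a rough illustration of how these three settings could interact, the sketch below compares two songs using the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The tool's actual fuzzy-matching algorithm is not documented here, so the scoring method and the `normalize` and `is_same_song` helpers are assumptions.

```python
from difflib import SequenceMatcher


def normalize(text: str, case_sensitive: bool) -> str:
    """Collapse whitespace and, unless case-sensitive, lowercase the text."""
    text = " ".join(text.split())
    return text if case_sensitive else text.lower()


def is_same_song(a: dict, b: dict, matching: dict) -> bool:
    """Compare two songs by artist and title according to the matching settings."""
    key_a = normalize(f"{a['artist']} - {a['title']}", matching["case_sensitive"])
    key_b = normalize(f"{b['artist']} - {b['title']}", matching["case_sensitive"])
    if key_a == key_b:
        return True
    if not matching["fuzzy_matching"]:
        return False
    # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
    return SequenceMatcher(None, key_a, key_b).ratio() >= matching["fuzzy_threshold"]


exact_only = {"fuzzy_matching": False, "fuzzy_threshold": 0.85, "case_sensitive": False}
fuzzy = dict(exact_only, fuzzy_matching=True)

original = {"artist": "Queen", "title": "Bohemian Rhapsody"}
shouted = {"artist": "QUEEN", "title": "bohemian rhapsody"}
misspelled = {"artist": "Queen", "title": "Bohemian Rhapsodie"}

print(is_same_song(original, shouted, exact_only))
print(is_same_song(original, misspelled, exact_only))
print(is_same_song(original, misspelled, fuzzy))
```

With the default settings only the case difference is tolerated. The misspelled title matches once `fuzzy_matching` is enabled, because its similarity score is above the 0.85 threshold.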

#### Output Settings
- **verbose**: Enable detailed output
- **include_reasons**: Include reason field in skip list
- **max_duplicates_per_song**: Maximum duplicates to process per song

#### File Type Settings
- **supported_extensions**: All supported file extensions
- **mp4_extensions**: Extensions treated as MP4 files
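
When writing a custom configuration file, it can help to check it against the documented defaults first. The sketch below overlays a partial config on the default values and validates `fuzzy_threshold`. Whether the tool itself fills in missing keys this way is an assumption, and `merge_config` is a hypothetical helper.

```python
import json

# Defaults copied from the sample config/config.json in this reference.
DEFAULTS = {
    "channel_priorities": ["Sing King Karaoke", "KaraFun Karaoke", "Stingray Karaoke"],
    "matching": {"fuzzy_matching": False, "fuzzy_threshold": 0.85, "case_sensitive": False},
    "output": {"verbose": False, "include_reasons": True, "max_duplicates_per_song": 10},
    "file_types": {"supported_extensions": [".mp3", ".cdg", ".mp4"], "mp4_extensions": [".mp4"]},
}


def merge_config(user_text: str) -> dict:
    """Overlay a user config (JSON text) on DEFAULTS, one section at a time."""
    user = json.loads(user_text)
    merged = {}
    for section, default in DEFAULTS.items():
        if isinstance(default, dict):
            # Keys the user omits fall back to the default value.
            merged[section] = {**default, **user.get(section, {})}
        else:
            merged[section] = user.get(section, default)
    threshold = merged["matching"]["fuzzy_threshold"]
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"fuzzy_threshold must be between 0.0 and 1.0, got {threshold}")
    return merged


config = merge_config('{"matching": {"fuzzy_matching": true}}')
print(config["matching"])
```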

## Input File Format

The tool expects a JSON array of song objects:

```json
[
  {
    "artist": "Artist Name",
    "title": "Song Title",
    "path": "path/to/file.mp3"
  }
]
```

Optional fields for MP4 files:
- `channel`: Channel/folder information
- ID3 tag information (artist, title, etc.)
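
Before running the tool on a large library, the input can be sanity-checked with a few lines of Python. This sketch assumes that `artist`, `title`, and `path` are all required on every entry, which the example above suggests but does not state. `load_songs` is a hypothetical helper, not part of the tool.

```python
import json

REQUIRED_FIELDS = ("artist", "title", "path")


def load_songs(text: str):
    """Parse the song list and return (valid_songs, problems)."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of song objects")
    valid, problems = [], []
    for index, song in enumerate(data):
        if not isinstance(song, dict):
            problems.append(f"entry {index}: not an object")
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in song]
        if missing:
            problems.append(f"entry {index}: missing {', '.join(missing)}")
        else:
            valid.append(song)
    return valid, problems


sample = """[
  {"artist": "Artist Name", "title": "Song Title", "path": "path/to/file.mp3"},
  {"title": "No Artist", "path": "path/to/other.mp4"}
]"""
valid, problems = load_songs(sample)
print(len(valid), problems)
```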

## Output Files

### Primary Output
- **skipSongs.json**: List of file paths to skip in future imports
  - Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`
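
An import script could consume the skip list roughly as shown below. The sketch relies only on the documented `path` field. The `filter_import` helper and the sample paths are made up for illustration.

```python
import json


def filter_import(candidate_paths, skip_list_text):
    """Return the candidate paths that are not listed in the skip list."""
    skip_entries = json.loads(skip_list_text)
    skip_paths = {entry["path"] for entry in skip_entries}
    return [path for path in candidate_paths if path not in skip_paths]


skip_list = '[{"path": "file/path.mp3", "reason": "duplicate"}]'
candidates = ["file/path.mp3", "file/keep.mp4"]
to_import = filter_import(candidates, skip_list)
print(to_import)
```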

### Report Files (always generated)
- **enhanced_summary_report.txt**: Overall analysis and statistics
- **channel_optimization_report.txt**: Channel priority suggestions
- **duplicate_pattern_report.txt**: Duplicate detection patterns
- **actionable_insights_report.txt**: Recommendations for collection management
- **detailed_duplicate_analysis.txt**: Specific duplicate groups
- **analysis_data.json**: Raw data for further processing
- **skip_songs_detailed.json**: Complete skip list with metadata (used by the web UI)

## File Type Priority System

The tool processes files in this priority order:

1. **MP4 files** (with channel priority sorting)
2. **CDG/MP3 pairs** (treated as single units)
3. **Standalone MP3** files
4. **Standalone CDG** files
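
The ordering above can be illustrated with a small ranking function. It assumes a CDG/MP3 pair means two files with the same folder and base name, which is the usual karaoke convention but is not confirmed by this reference. `rank_by_file_type` is a hypothetical name.

```python
from pathlib import PurePosixPath


def rank_by_file_type(paths):
    """Map each path to a rank: 0 MP4, 1 CDG/MP3 pair, 2 standalone MP3, 3 standalone CDG."""
    # Collect the extensions seen for each path with its extension removed.
    extensions_by_stem = {}
    for path in paths:
        p = PurePosixPath(path)
        extensions_by_stem.setdefault(str(p.with_suffix("")), set()).add(p.suffix.lower())

    ranks = {}
    for path in paths:
        p = PurePosixPath(path)
        ext = p.suffix.lower()
        siblings = extensions_by_stem[str(p.with_suffix(""))]
        if ext == ".mp4":
            ranks[path] = 0
        elif ext in (".mp3", ".cdg") and {".mp3", ".cdg"} <= siblings:
            ranks[path] = 1  # both halves of a pair share the same rank
        elif ext == ".mp3":
            ranks[path] = 2
        elif ext == ".cdg":
            ranks[path] = 3
    return ranks


files = [
    "cdg/Song A.mp3",
    "cdg/Song A.cdg",
    "mp3/Song B.mp3",
    "cdg/Song C.cdg",
    "mp4/Song D.mp4",
]
ranks = rank_by_file_type(files)
for path in sorted(files, key=ranks.get):
    print(ranks[path], path)
```

Files with other extensions are left out of the result, mirroring the idea that only `supported_extensions` take part in the comparison.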

## Error Handling

The tool provides clear error messages for:
- Missing input files
- Invalid JSON format
- Configuration errors
- File permission issues
- Processing errors

## Performance Notes

- Successfully tested with 37,000+ songs
- Processes large datasets efficiently
- Shows progress indicators for long operations
- Memory-efficient processing

## Integration with Web UI

The CLI tool integrates with the web UI:
- Web UI can load CLI-generated data
- Priority preferences from web UI are used by CLI
- Shared configuration and data files
- Consistent processing logic

## Troubleshooting

### Common Issues
1. **File not found**: Check the input file path and permissions
2. **JSON errors**: Validate the input file format
3. **Configuration errors**: Use `--show-config` to verify settings
4. **Permission errors**: Check output directory permissions
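
### Debug Mode
```bash
python cli/main.py --verbose --dry-run --show-config
```
Complete debugging setup:
- Shows configuration
- Verbose processing
- No file changes

`--show-config` is described above as displaying the settings and exiting, so if the analysis does not run, repeat the command without that flag.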

## Version Information

This command reference covers Karaoke Song Library Cleanup Tool v2.0:
- CLI: Fully functional with comprehensive options
- Web UI: Interactive priority management
- Priority System: Drag-and-drop with persistence
- Reports: Enhanced analysis with actionable insights