KaraokeMerge/cli/commands.txt

# Karaoke Song Library Cleanup Tool - CLI Commands Reference
## Overview
The CLI tool analyzes karaoke song collections, identifies duplicates, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.
## Basic Usage
### Standard Analysis
```bash
python cli/main.py
```
Runs the tool with default settings:
- Input: `data/allSongs.json`
- Config: `config/config.json`
- Output: `data/skipSongs.json`
- Verbose: Disabled
- Reports: **Automatically generated** (including web UI data)
### Verbose Output
```bash
python cli/main.py --verbose
# or
python cli/main.py -v
```
Enables detailed output showing:
- Individual song processing
- Duplicate detection details
- File type analysis
- Channel priority decisions
### Dry Run Mode
```bash
python cli/main.py --dry-run
```
Analyzes songs without generating the skip list file. Useful for:
- Testing configuration changes
- Previewing results before committing
- Validating input data
## Configuration Options
### Custom Configuration File
```bash
python cli/main.py --config path/to/custom_config.json
```
Uses a custom configuration file instead of the default `config/config.json`.
### Show Current Configuration
```bash
python cli/main.py --show-config
```
Displays the current configuration settings and exits. Useful for:
- Verifying configuration values
- Debugging configuration issues
- Understanding current settings
## Input/Output Options
### Custom Input File
```bash
python cli/main.py --input path/to/songs.json
```
Specifies a custom input file instead of the default `data/allSongs.json`.
### Custom Output Directory
```bash
python cli/main.py --output-dir ./custom_output
```
Saves output files to a custom directory instead of the default `data/` folder.
## Report Generation
### Detailed Reports (Always Generated)
Reports are now **automatically generated** every time you run the CLI tool. The `--save-reports` flag is kept for backward compatibility but is no longer required.
Generated reports include:
- `enhanced_summary_report.txt` - Comprehensive analysis
- `channel_optimization_report.txt` - Priority optimization suggestions
- `duplicate_pattern_report.txt` - Duplicate pattern analysis
- `actionable_insights_report.txt` - Recommendations and insights
- `detailed_duplicate_analysis.txt` - Specific songs and their duplicates
- `analysis_data.json` - Raw analysis data for further processing
- `skip_songs_detailed.json` - **Web UI data (always generated)**
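The flags covered so far could be declared with `argparse` roughly as follows. This is a minimal sketch, not the tool's actual source: the flag names and defaults come from this reference, while `build_parser` and the help strings are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags (hypothetical wiring, not the real main.py)."""
    parser = argparse.ArgumentParser(description="Karaoke Song Library Cleanup Tool")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="show detailed processing output")
    parser.add_argument("--dry-run", action="store_true",
                        help="analyze without writing the skip list")
    parser.add_argument("--config", default="config/config.json",
                        help="path to the configuration file")
    parser.add_argument("--show-config", action="store_true",
                        help="print the configuration and exit")
    parser.add_argument("--input", default="data/allSongs.json",
                        help="path to the input song list")
    parser.add_argument("--output-dir", default="data",
                        help="directory for output files")
    # Accepted for backward compatibility; reports are generated regardless.
    parser.add_argument("--save-reports", action="store_true",
                        help="no longer required")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--config", "custom_config.json", "--dry-run", "-v"])
print(args.config, args.dry_run, args.verbose, args.output_dir)
```

Flags that are not passed fall back to the defaults listed under Standard Analysis.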
## Combined Examples
### Full Analysis with Reports
```bash
python cli/main.py --verbose
```
Runs complete analysis with:
- Verbose output for detailed processing information
- **Automatic comprehensive report generation**
- Skip list creation
### Custom Configuration with Dry Run
```bash
python cli/main.py --config custom_config.json --dry-run --verbose
```
Tests a custom configuration without writing the skip list:
- Uses custom configuration
- Shows detailed processing
- Skip list file is not written
### Custom Input/Output with Reports
```bash
python cli/main.py --input /path/to/songs.json --output-dir ./reports
```
Processes custom input and saves all outputs to reports directory:
- Custom input file
- Custom output location
- **All report files automatically generated**
### Minimal Output
```bash
python cli/main.py --output-dir ./minimal
```
Runs without verbose console output and writes results to `./minimal`:
- No verbose logging
- Report files are still generated automatically
- Skip list written to the custom directory
## Configuration File Structure
The default configuration file (`config/config.json`) contains:
```json
{
  "channel_priorities": [
    "Sing King Karaoke",
    "KaraFun Karaoke",
    "Stingray Karaoke"
  ],
  "matching": {
    "fuzzy_matching": false,
    "fuzzy_threshold": 0.85,
    "case_sensitive": false
  },
  "output": {
    "verbose": false,
    "include_reasons": true,
    "max_duplicates_per_song": 10
  },
  "file_types": {
    "supported_extensions": [".mp3", ".cdg", ".mp4"],
    "mp4_extensions": [".mp4"]
  }
}
```
### Configuration Options Explained
#### Channel Priorities
- **channel_priorities**: Array of folder names for MP4 files
- Order determines priority (first = highest priority)
- Files without matching folders are marked for manual review
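The priority rule above can be sketched as follows. This is an illustration only: it assumes a channel is matched by finding its folder name among the directory components of the file path, and `channel_rank` and `pick_best` are hypothetical helpers, not functions from the tool.

```python
from pathlib import PurePosixPath

CHANNEL_PRIORITIES = ["Sing King Karaoke", "KaraFun Karaoke", "Stingray Karaoke"]


def channel_rank(path, priorities=CHANNEL_PRIORITIES):
    """Return the index of the first priority folder in the path, or None."""
    # Normalize Windows separators, then drop the file name itself.
    folders = PurePosixPath(path.replace("\\", "/")).parts[:-1]
    for rank, channel in enumerate(priorities):
        if channel in folders:
            return rank
    return None


def pick_best(paths):
    """Return (path to keep, paths needing manual review) for one duplicate group."""
    ranked = [(channel_rank(p), p) for p in paths]
    matched = sorted((r, p) for r, p in ranked if r is not None)
    unmatched = [p for r, p in ranked if r is None]
    keep = matched[0][1] if matched else None
    return keep, unmatched


keep, review = pick_best([
    "videos/KaraFun Karaoke/Song.mp4",
    "videos/Sing King Karaoke/Song.mp4",
    "videos/Misc/Song.mp4",
])
print(keep)    # the Sing King Karaoke copy, since it is listed first
print(review)  # the copy under videos/Misc, which matches no configured folder
```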
#### Matching Settings
- **fuzzy_matching**: Enable/disable fuzzy string matching
- **fuzzy_threshold**: Similarity threshold (0.0-1.0) for fuzzy matching
- **case_sensitive**: Case-sensitive artist/title comparison
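How the three matching settings might interact is sketched below. The tool's actual similarity algorithm is not documented here, so the standard library's `difflib.SequenceMatcher` stands in for it; `is_match` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def is_match(a, b, fuzzy_matching=False, fuzzy_threshold=0.85, case_sensitive=False):
    """Compare two artist or title strings under the configured matching settings."""
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    if a == b:
        return True
    if not fuzzy_matching:
        return False
    # ratio() returns a similarity between 0.0 and 1.0 (stand-in metric).
    return SequenceMatcher(None, a, b).ratio() >= fuzzy_threshold


# Exact comparison ignores case by default.
print(is_match("Bohemian Rhapsody", "bohemian rhapsody"))
# Punctuation differences only match once fuzzy matching is enabled.
print(is_match("Dont Stop Believin", "Don't Stop Believin'"))
print(is_match("Dont Stop Believin", "Don't Stop Believin'", fuzzy_matching=True))
```

Raising `fuzzy_threshold` toward 1.0 makes fuzzy matching stricter; at 1.0 it behaves like exact matching.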
#### Output Settings
- **verbose**: Enable detailed output
- **include_reasons**: Include reason field in skip list
- **max_duplicates_per_song**: Maximum duplicates to process per song
#### File Type Settings
- **supported_extensions**: All supported file extensions
- **mp4_extensions**: Extensions treated as MP4 files
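A few sanity checks on a custom configuration can catch mistakes before a run. The checks below are assumptions drawn from the descriptions above (a threshold within 0.0-1.0, MP4 extensions that are also listed as supported, no repeated channel names); `validate_config` is a hypothetical helper, not part of the tool.

```python
import json


def validate_config(config):
    """Return a list of human-readable problems found in a config dict."""
    problems = []

    priorities = config.get("channel_priorities", [])
    if len(priorities) != len(set(priorities)):
        problems.append("channel_priorities contains duplicate folder names")

    threshold = config.get("matching", {}).get("fuzzy_threshold", 0.85)
    if not 0.0 <= threshold <= 1.0:
        problems.append("matching.fuzzy_threshold must be between 0.0 and 1.0")

    file_types = config.get("file_types", {})
    supported = set(file_types.get("supported_extensions", []))
    extra = [e for e in file_types.get("mp4_extensions", []) if e not in supported]
    if extra:
        problems.append(f"mp4_extensions not in supported_extensions: {extra}")

    return problems


sample = json.loads("""
{
  "channel_priorities": ["Sing King Karaoke", "KaraFun Karaoke"],
  "matching": {"fuzzy_matching": true, "fuzzy_threshold": 1.5},
  "file_types": {"supported_extensions": [".mp3", ".cdg"], "mp4_extensions": [".mp4"]}
}
""")
for problem in validate_config(sample):
    print(problem)
```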
## Input File Format
The tool expects a JSON array of song objects:
```json
[
  {
    "artist": "Artist Name",
    "title": "Song Title",
    "path": "path/to/file.mp3"
  }
]
```
Optional fields for MP4 files:
- `channel`: Channel/folder information
- ID3 tag information (artist, title, etc.)
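Before running the tool on a large export, the input can be pre-checked against this shape. The sketch below assumes `artist`, `title`, and `path` are all required, which this reference implies but does not state; `split_songs` is a hypothetical helper.

```python
import json

REQUIRED_FIELDS = ("artist", "title", "path")  # assumed to be required


def split_songs(raw_json):
    """Parse the input text and separate usable entries from incomplete ones."""
    songs = json.loads(raw_json)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(songs, list):
        raise ValueError("input must be a JSON array of song objects")
    usable, incomplete = [], []
    for song in songs:
        ok = isinstance(song, dict) and all(song.get(f) for f in REQUIRED_FIELDS)
        (usable if ok else incomplete).append(song)
    return usable, incomplete


raw = """
[
  {"artist": "Artist Name", "title": "Song Title", "path": "path/to/file.mp3"},
  {"artist": "Artist Name", "title": "Song Title", "path": "videos/file.mp4", "channel": "Sing King Karaoke"},
  {"artist": "Artist Name", "path": "path/to/untitled.cdg"}
]
"""
usable, incomplete = split_songs(raw)
print(len(usable), "usable,", len(incomplete), "incomplete")
```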
## Output Files
### Primary Output
- **skipSongs.json**: List of file paths to skip in future imports
- Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`
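An import script could consume the skip list along these lines. This is a sketch that assumes the format shown above and exact string equality between paths; `load_skip_paths` and `filter_import` are hypothetical names.

```python
import json


def load_skip_paths(skip_json):
    """Collect the paths listed in skipSongs.json content into a set."""
    return {entry["path"] for entry in json.loads(skip_json)}


def filter_import(candidate_paths, skip_paths):
    """Keep only the candidates that are not on the skip list, preserving order."""
    return [p for p in candidate_paths if p not in skip_paths]


skip_json = '[{"path": "file/path.mp3", "reason": "duplicate"}]'
to_import = filter_import(["file/path.mp3", "file/other.mp3"], load_skip_paths(skip_json))
print(to_import)
```

Reading only `path` keeps the sketch working whether or not `include_reasons` is enabled.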
### Report Files (always generated)
- **enhanced_summary_report.txt**: Overall analysis and statistics
- **channel_optimization_report.txt**: Channel priority suggestions
- **duplicate_pattern_report.txt**: Duplicate detection patterns
- **actionable_insights_report.txt**: Recommendations for collection management
- **detailed_duplicate_analysis.txt**: Specific duplicate groups
- **analysis_data.json**: Raw data for further processing
- **skip_songs_detailed.json**: Complete skip list with metadata
## File Type Priority System
The tool processes files in this priority order:
1. **MP4 files** (with channel priority sorting)
2. **CDG/MP3 pairs** (treated as single units)
3. **Standalone MP3** files
4. **Standalone CDG** files
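The ordering above can be expressed as a rank per file. This sketch assumes a CDG/MP3 pair is two files in the same folder that share a base name, which is a common convention but is not confirmed by this reference; `classify` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def classify(paths):
    """Map each path to a priority rank 1-4 (None for unsupported extensions)."""
    # Group extensions by "folder/base name" to detect CDG/MP3 pairs.
    extensions_by_stem = defaultdict(set)
    for p in paths:
        pp = PurePosixPath(p)
        extensions_by_stem[str(pp.with_suffix(""))].add(pp.suffix.lower())

    ranks = {}
    for p in paths:
        pp = PurePosixPath(p)
        ext = pp.suffix.lower()
        siblings = extensions_by_stem[str(pp.with_suffix(""))]
        if ext == ".mp4":
            ranks[p] = 1
        elif ext in {".mp3", ".cdg"} and {".mp3", ".cdg"} <= siblings:
            ranks[p] = 2  # both halves of a pair share the same rank
        elif ext == ".mp3":
            ranks[p] = 3
        elif ext == ".cdg":
            ranks[p] = 4
        else:
            ranks[p] = None
    return ranks


ranks = classify(["a/Song.mp4", "a/Song.mp3", "a/Song.cdg", "b/Solo.mp3", "c/Only.cdg"])
for path, rank in ranks.items():
    print(rank, path)
```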
## Error Handling
The tool provides clear error messages for:
- Missing input files
- Invalid JSON format
- Configuration errors
- File permission issues
- Processing errors
## Performance Notes
- Successfully tested with 37,000+ songs
- Processes large datasets efficiently
- Shows progress indicators for long operations
- Memory-efficient processing
## Integration with Web UI
The CLI tool integrates with the web UI:
- Web UI can load CLI-generated data
- Priority preferences from web UI are used by CLI
- Shared configuration and data files
- Consistent processing logic
## Troubleshooting
### Common Issues
1. **File not found**: Check input file path and permissions
2. **JSON errors**: Validate input file format
3. **Configuration errors**: Use --show-config to verify settings
4. **Permission errors**: Check output directory permissions
### Debug Mode
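```bash
python cli/main.py --show-config
python cli/main.py --verbose --dry-run
```
Complete debugging setup. Because `--show-config` displays the settings and exits, run it as a separate step:
- First command shows the configuration
- Second command runs verbose processing
- No skip list is written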
## Version Information
This command reference covers Karaoke Song Library Cleanup Tool v2.0:
- CLI: Fully functional with comprehensive options
- Web UI: Interactive priority management
- Priority System: Drag-and-drop with persistence
- Reports: Enhanced analysis with actionable insights