# Karaoke Song Library Cleanup Tool - CLI Commands Reference

## Overview

The CLI tool analyzes karaoke song collections, identifies duplicates, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.

## Basic Usage

### Standard Analysis

```bash
python cli/main.py
```

Runs the tool with default settings:

- Input: `data/allSongs.json`
- Config: `config/config.json`
- Output: `data/skipSongs.json`
- Verbose: Disabled
- Reports: Not saved

### Verbose Output

```bash
python cli/main.py --verbose
# or
python cli/main.py -v
```

Enables detailed output showing:

- Individual song processing
- Duplicate detection details
- File type analysis
- Channel priority decisions

### Dry Run Mode

```bash
python cli/main.py --dry-run
```

Analyzes songs without generating the skip list file. Useful for:

- Testing configuration changes
- Previewing results before committing
- Validating input data

## Configuration Options

### Custom Configuration File

```bash
python cli/main.py --config path/to/custom_config.json
```

Uses a custom configuration file instead of the default `config/config.json`.

### Show Current Configuration

```bash
python cli/main.py --show-config
```

Displays the current configuration settings and exits. Useful for:

- Verifying configuration values
- Debugging configuration issues
- Understanding current settings

## Input/Output Options

### Custom Input File

```bash
python cli/main.py --input path/to/songs.json
```

Specifies a custom input file instead of the default `data/allSongs.json`.

### Custom Output Directory

```bash
python cli/main.py --output-dir ./custom_output
```

Saves output files to a custom directory instead of the default `data/` folder.
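
### Example: Calling the CLI from Python

The flags described so far can also be assembled from a script, for example when running the cleanup as part of a larger import job. The sketch below is illustrative only: `build_command` and `run_cleanup` are hypothetical helper names, not part of the tool, and the sketch assumes it is run from the project root so that `cli/main.py` resolves.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(
    input_path: Optional[str] = None,
    output_dir: Optional[str] = None,
    config_path: Optional[str] = None,
    verbose: bool = False,
    dry_run: bool = False,
) -> List[str]:
    """Assemble an argument list using only the flags documented above.

    Hypothetical helper: options left unset are omitted, so the tool
    falls back to its documented defaults.
    """
    cmd = [sys.executable, "cli/main.py"]
    if config_path:
        cmd += ["--config", config_path]
    if input_path:
        cmd += ["--input", input_path]
    if output_dir:
        cmd += ["--output-dir", output_dir]
    if verbose:
        cmd.append("--verbose")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_cleanup(**options) -> int:
    """Run the CLI with the given options and return its exit code."""
    return subprocess.run(build_command(**options)).returncode


if __name__ == "__main__":
    # Preview a custom input file without writing the skip list.
    print(build_command(input_path="path/to/songs.json", dry_run=True, verbose=True))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces, which is common in karaoke folder names.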

## Report Generation

### Save Detailed Reports

```bash
python cli/main.py --save-reports
```

Generates comprehensive analysis reports in the output directory:

- `enhanced_summary_report.txt` - Comprehensive analysis
- `channel_optimization_report.txt` - Priority optimization suggestions
- `duplicate_pattern_report.txt` - Duplicate pattern analysis
- `actionable_insights_report.txt` - Recommendations and insights
- `detailed_duplicate_analysis.txt` - Specific songs and their duplicates
- `analysis_data.json` - Raw analysis data for further processing
- `skip_songs_detailed.json` - Full skip list with metadata

## Combined Examples

### Full Analysis with Reports

```bash
python cli/main.py --verbose --save-reports
```

Runs complete analysis with:

- Verbose output for detailed processing information
- Comprehensive report generation
- Skip list creation

### Custom Configuration with Dry Run

```bash
python cli/main.py --config custom_config.json --dry-run --verbose
```

Tests a custom configuration without generating files:

- Uses custom configuration
- Shows detailed processing
- No output files created

### Custom Input/Output with Reports

```bash
python cli/main.py --input /path/to/songs.json --output-dir ./reports --save-reports
```

Processes custom input and saves all outputs to the `reports` directory:

- Custom input file
- Custom output location
- All report files generated

### Minimal Output

```bash
python cli/main.py --output-dir ./minimal
```

Runs with minimal output:

- No verbose logging
- No detailed reports
- Only generates skip list

## Configuration File Structure

The default configuration file (`config/config.json`) contains:

```json
{
  "channel_priorities": [
    "Sing King Karaoke",
    "KaraFun Karaoke",
    "Stingray Karaoke"
  ],
  "matching": {
    "fuzzy_matching": false,
    "fuzzy_threshold": 0.85,
    "case_sensitive": false
  },
  "output": {
    "verbose": false,
    "include_reasons": true,
    "max_duplicates_per_song": 10
  },
  "file_types": {
    "supported_extensions": [".mp3", ".cdg", ".mp4"],
    "mp4_extensions": [".mp4"]
  }
}
```

### Configuration Options Explained

#### Channel Priorities

- **channel_priorities**: Array of folder names for MP4 files
  - Order determines priority (first = highest priority)
  - Files without matching folders are marked for manual review

#### Matching Settings

- **fuzzy_matching**: Enable/disable fuzzy string matching
- **fuzzy_threshold**: Similarity threshold (0.0-1.0) for fuzzy matching
- **case_sensitive**: Case-sensitive artist/title comparison

#### Output Settings

- **verbose**: Enable detailed output
- **include_reasons**: Include reason field in skip list
- **max_duplicates_per_song**: Maximum duplicates to process per song

#### File Type Settings

- **supported_extensions**: All supported file extensions
- **mp4_extensions**: Extensions treated as MP4 files

## Input File Format

The tool expects a JSON array of song objects:

```json
[
  {
    "artist": "Artist Name",
    "title": "Song Title",
    "path": "path/to/file.mp3"
  }
]
```

Optional fields for MP4 files:

- `channel`: Channel/folder information
- ID3 tag information (artist, title, etc.)

## Output Files

### Primary Output

- **skipSongs.json**: List of file paths to skip in future imports
  - Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`

### Report Files (with --save-reports)

- **enhanced_summary_report.txt**: Overall analysis and statistics
- **channel_optimization_report.txt**: Channel priority suggestions
- **duplicate_pattern_report.txt**: Duplicate detection patterns
- **actionable_insights_report.txt**: Recommendations for collection management
- **detailed_duplicate_analysis.txt**: Specific duplicate groups
- **analysis_data.json**: Raw data for further processing
- **skip_songs_detailed.json**: Complete skip list with metadata

## File Type Priority System

The tool processes files in this priority order:

1. **MP4 files** (with channel priority sorting)
2. **CDG/MP3 pairs** (treated as single units)
3. **Standalone MP3** files
4. **Standalone CDG** files

## Error Handling

The tool provides clear error messages for:

- Missing input files
- Invalid JSON format
- Configuration errors
- File permission issues
- Processing errors

## Performance Notes

- Successfully tested with 37,000+ songs
- Processes large datasets efficiently
- Shows progress indicators for long operations
- Memory-efficient processing

## Integration with Web UI

The CLI tool integrates with the web UI:

- Web UI can load CLI-generated data
- Priority preferences from web UI are used by CLI
- Shared configuration and data files
- Consistent processing logic

## Troubleshooting

### Common Issues

1. **File not found**: Check input file path and permissions
2. **JSON errors**: Validate input file format
3. **Configuration errors**: Use `--show-config` to verify settings
4. **Permission errors**: Check output directory permissions

### Debug Mode

```bash
python cli/main.py --verbose --dry-run --show-config
```

Complete debugging setup:

- Shows configuration
- Verbose processing
- No file changes

## Version Information

This commands reference is for Karaoke Song Library Cleanup Tool v2.0:

- CLI: Fully functional with comprehensive options
- Web UI: Interactive priority management
- Priority System: Drag-and-drop with persistence
- Reports: Enhanced analysis with actionable insights
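
## Example: Validating an Input File

The Troubleshooting section suggests validating the input file when JSON errors occur. The following is a minimal sketch of such a check, separate from the tool itself. It assumes that the three fields shown in the Input File Format sample (`artist`, `title`, `path`) are required, non-empty strings; the tool's own rules may be more lenient. `find_problems` and `check_file` are hypothetical names chosen for illustration.

```python
import json
from typing import Any, List

# Assumption: the fields from the documented sample are all required.
REQUIRED_FIELDS = ("artist", "title", "path")


def find_problems(songs: Any) -> List[str]:
    """Return human-readable problems; an empty list means the data looks valid."""
    if not isinstance(songs, list):
        return ["top-level JSON value must be an array of song objects"]
    problems = []
    for index, song in enumerate(songs):
        if not isinstance(song, dict):
            problems.append(f"entry {index}: expected an object")
            continue
        for field in REQUIRED_FIELDS:
            value = song.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {index}: missing or empty '{field}'")
    return problems


def check_file(path: str) -> List[str]:
    """Load a JSON file and validate it, reporting read or parse errors as problems."""
    try:
        with open(path, encoding="utf-8") as handle:
            return find_problems(json.load(handle))
    except (OSError, ValueError) as exc:
        # ValueError covers json.JSONDecodeError and invalid UTF-8.
        return [f"{path}: {exc}"]


if __name__ == "__main__":
    for problem in check_file("data/allSongs.json"):
        print(problem)
```

Running a check like this before a `--dry-run` pass separates malformed input from configuration problems, so each can be fixed on its own.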