diff --git a/PRD.md b/PRD.md
index 4934a4a..d233539 100644
--- a/PRD.md
+++ b/PRD.md
@@ -48,16 +48,26 @@ These principles are fundamental to the project's long-term success and must be
 
 - **PRD.md Updates:** Any changes to project requirements, architecture, or functionality must be reflected in this document
 - **README.md Updates:** User-facing features, installation instructions, or usage changes must be documented
+- **CLI Commands Documentation:** All CLI functionality, options, and usage examples must be documented in `cli/commands.txt`
 - **Code Comments:** Significant logic changes should include inline documentation
 - **API Documentation:** New endpoints, functions, or interfaces must be documented
 
 **Documentation Update Checklist:**
 - [ ] Update PRD.md with any architectural or requirement changes
 - [ ] Update README.md with new features, installation steps, or usage instructions
+- [ ] Update `cli/commands.txt` with any new CLI options, examples, or functionality changes
 - [ ] Add inline comments for complex logic or business rules
 - [ ] Update any configuration examples or file structure documentation
 - [ ] Review and update implementation status sections
 
+**CLI Commands Documentation Requirements:**
+- **Comprehensive Coverage:** All CLI arguments, options, and flags must be documented with examples
+- **Usage Examples:** Provide practical examples for common use cases and combinations
+- **Configuration Details:** Document all configuration options and their effects
+- **Error Handling:** Include troubleshooting information and common issues
+- **Integration Notes:** Document how CLI integrates with web UI and other components
+- **Version Tracking:** Keep version information and feature status up to date
+
 This documentation requirement is mandatory and ensures the project remains maintainable and accessible to future developers and users.
 
 ### 2.3 Code Quality & Development Standards
@@ -230,7 +240,8 @@ KaraokeMerge/
 │   ├── matching.py          # Song matching logic
 │   ├── report.py            # Report generation
 │   ├── preferences.py       # Priority preferences management
-│   └── utils.py             # Utility functions
+│   ├── utils.py             # Utility functions
+│   └── commands.txt         # Comprehensive CLI commands reference
 ├── web/                     # Web UI for manual review
 │   ├── app.py               # Flask web application
 │   └── templates/
diff --git a/cli/__pycache__/matching.cpython-313.pyc b/cli/__pycache__/matching.cpython-313.pyc
index 288c541..314acf2 100644
Binary files a/cli/__pycache__/matching.cpython-313.pyc and b/cli/__pycache__/matching.cpython-313.pyc differ
diff --git a/cli/commands.txt b/cli/commands.txt
new file mode 100644
index 0000000..b534e44
--- /dev/null
+++ b/cli/commands.txt
@@ -0,0 +1,262 @@
+# Karaoke Song Library Cleanup Tool - CLI Commands Reference
+
+## Overview
+The CLI tool analyzes karaoke song collections, identifies duplicates, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.
+
+## Basic Usage
+
+### Standard Analysis
+```bash
+python cli/main.py
+```
+Runs the tool with default settings:
+- Input: `data/allSongs.json`
+- Config: `config/config.json`
+- Output: `data/skipSongs.json`
+- Verbose: Disabled
+- Reports: Not saved
+
+### Verbose Output
+```bash
+python cli/main.py --verbose
+# or
+python cli/main.py -v
+```
+Enables detailed output showing:
+- Individual song processing
+- Duplicate detection details
+- File type analysis
+- Channel priority decisions
+
+### Dry Run Mode
+```bash
+python cli/main.py --dry-run
+```
+Analyzes songs without generating the skip list file. Useful for:
+- Testing configuration changes
+- Previewing results before committing
+- Validating input data
+
+## Configuration Options
+
+### Custom Configuration File
+```bash
+python cli/main.py --config path/to/custom_config.json
+```
+Uses a custom configuration file instead of the default `config/config.json`.
+
+### Show Current Configuration
+```bash
+python cli/main.py --show-config
+```
+Displays the current configuration settings and exits. Useful for:
+- Verifying configuration values
+- Debugging configuration issues
+- Understanding current settings
+
+## Input/Output Options
+
+### Custom Input File
+```bash
+python cli/main.py --input path/to/songs.json
+```
+Specifies a custom input file instead of the default `data/allSongs.json`.
+
+### Custom Output Directory
+```bash
+python cli/main.py --output-dir ./custom_output
+```
+Saves output files to a custom directory instead of the default `data/` folder.
+
+## Report Generation
+
+### Save Detailed Reports
+```bash
+python cli/main.py --save-reports
+```
+Generates comprehensive analysis reports in the output directory:
+- `enhanced_summary_report.txt` - Comprehensive analysis
+- `channel_optimization_report.txt` - Priority optimization suggestions
+- `duplicate_pattern_report.txt` - Duplicate pattern analysis
+- `actionable_insights_report.txt` - Recommendations and insights
+- `detailed_duplicate_analysis.txt` - Specific songs and their duplicates
+- `analysis_data.json` - Raw analysis data for further processing
+- `skip_songs_detailed.json` - Full skip list with metadata
+
+## Combined Examples
+
+### Full Analysis with Reports
+```bash
+python cli/main.py --verbose --save-reports
+```
+Runs complete analysis with:
+- Verbose output for detailed processing information
+- Comprehensive report generation
+- Skip list creation
+
+### Custom Configuration with Dry Run
+```bash
+python cli/main.py --config custom_config.json --dry-run --verbose
+```
+Tests a custom configuration without generating files:
+- Uses custom configuration
+- Shows detailed processing
+- No output files created
+
+### Custom Input/Output with Reports
+```bash
+python cli/main.py --input /path/to/songs.json --output-dir ./reports --save-reports
+```
+Processes custom input and saves all outputs to reports directory:
+- Custom input file
+- Custom output location
+- All report files generated
+
+### Minimal Output
+```bash
+python cli/main.py --output-dir ./minimal
+```
+Runs with minimal output:
+- No verbose logging
+- No detailed reports
+- Only generates skip list
+
+## Configuration File Structure
+
+The default configuration file (`config/config.json`) contains:
+
+```json
+{
+  "channel_priorities": [
+    "Sing King Karaoke",
+    "KaraFun Karaoke",
+    "Stingray Karaoke"
+  ],
+  "matching": {
+    "fuzzy_matching": false,
+    "fuzzy_threshold": 0.85,
+    "case_sensitive": false
+  },
+  "output": {
+    "verbose": false,
+    "include_reasons": true,
+    "max_duplicates_per_song": 10
+  },
+  "file_types": {
+    "supported_extensions": [".mp3", ".cdg", ".mp4"],
+    "mp4_extensions": [".mp4"]
+  }
+}
+```
+
+### Configuration Options Explained
+
+#### Channel Priorities
+- **channel_priorities**: Array of folder names for MP4 files
+- Order determines priority (first = highest priority)
+- Files without matching folders are marked for manual review
+
+#### Matching Settings
+- **fuzzy_matching**: Enable/disable fuzzy string matching
+- **fuzzy_threshold**: Similarity threshold (0.0-1.0) for fuzzy matching
+- **case_sensitive**: Case-sensitive artist/title comparison
+
+#### Output Settings
+- **verbose**: Enable detailed output
+- **include_reasons**: Include reason field in skip list
+- **max_duplicates_per_song**: Maximum duplicates to process per song
+
+#### File Type Settings
+- **supported_extensions**: All supported file extensions
+- **mp4_extensions**: Extensions treated as MP4 files
+
+## Input File Format
+
+The tool expects a JSON array of song objects:
+
+```json
+[
+  {
+    "artist": "Artist Name",
+    "title": "Song Title",
+    "path": "path/to/file.mp3"
+  }
+]
+```
+
+Optional fields for MP4 files:
+- `channel`: Channel/folder information
+- ID3 tag information (artist, title, etc.)
+
+## Output Files
+
+### Primary Output
+- **skipSongs.json**: List of file paths to skip in future imports
+- Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`
+
+### Report Files (with --save-reports)
+- **enhanced_summary_report.txt**: Overall analysis and statistics
+- **channel_optimization_report.txt**: Channel priority suggestions
+- **duplicate_pattern_report.txt**: Duplicate detection patterns
+- **actionable_insights_report.txt**: Recommendations for collection management
+- **detailed_duplicate_analysis.txt**: Specific duplicate groups
+- **analysis_data.json**: Raw data for further processing
+- **skip_songs_detailed.json**: Complete skip list with metadata
+
+## File Type Priority System
+
+The tool processes files in this priority order:
+
+1. **MP4 files** (with channel priority sorting)
+2. **CDG/MP3 pairs** (treated as single units)
+3. **Standalone MP3** files
+4. **Standalone CDG** files
+
+## Error Handling
+
+The tool provides clear error messages for:
+- Missing input files
+- Invalid JSON format
+- Configuration errors
+- File permission issues
+- Processing errors
+
+## Performance Notes
+
+- Successfully tested with 37,000+ songs
+- Processes large datasets efficiently
+- Shows progress indicators for long operations
+- Memory-efficient processing
+
+## Integration with Web UI
+
+The CLI tool integrates with the web UI:
+- Web UI can load CLI-generated data
+- Priority preferences from web UI are used by CLI
+- Shared configuration and data files
+- Consistent processing logic
+
+## Troubleshooting
+
+### Common Issues
+1. **File not found**: Check input file path and permissions
+2. **JSON errors**: Validate input file format
+3. **Configuration errors**: Use --show-config to verify settings
+4. **Permission errors**: Check output directory permissions
+
+### Debug Mode
+```bash
+python cli/main.py --verbose --dry-run --show-config
+```
+Complete debugging setup:
+- Shows configuration
+- Verbose processing
+- No file changes
+
+## Version Information
+
+This commands reference is for Karaoke Song Library Cleanup Tool v2.0
+- CLI: Fully functional with comprehensive options
+- Web UI: Interactive priority management
+- Priority System: Drag-and-drop with persistence
+- Reports: Enhanced analysis with actionable insights
\ No newline at end of file
diff --git a/cli/main.py b/cli/main.py
index 491858c..c3c2933 100644
--- a/cli/main.py
+++ b/cli/main.py
@@ -123,6 +123,7 @@ def main():
     songs = load_songs(args.input)
 
     # Initialize components
+    data_dir = args.output_dir
     matcher = SongMatcher(config, data_dir)
     reporter = ReportGenerator(config)
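
Review note: the `cli/main.py` hunk assigns `data_dir` from `args.output_dir` before `SongMatcher(config, data_dir)` uses it. The argument parser itself is not part of this diff, so the following is only a minimal sketch of how the flags documented in `cli/commands.txt` might be declared with `argparse`. The function name `build_parser` and the help strings are hypothetical; the flag names and default paths are taken from the commands reference, and the real parser in `cli/main.py` may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in cli/commands.txt."""
    parser = argparse.ArgumentParser(
        description="Karaoke Song Library Cleanup Tool"
    )
    # Defaults follow the "Standard Analysis" section of the reference.
    parser.add_argument("--config", default="config/config.json",
                        help="Path to the configuration file")
    parser.add_argument("--input", default="data/allSongs.json",
                        help="Path to the input songs JSON file")
    parser.add_argument("--output-dir", default="data",
                        help="Directory for the skip list and reports")
    # Boolean switches are off unless passed on the command line.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable detailed output")
    parser.add_argument("--dry-run", action="store_true",
                        help="Analyze without writing the skip list")
    parser.add_argument("--save-reports", action="store_true",
                        help="Write detailed report files")
    parser.add_argument("--show-config", action="store_true",
                        help="Print the current configuration and exit")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--output-dir", "./custom_output", "--dry-run"]
    )
    # argparse maps --output-dir to the attribute output_dir, which is
    # the value the main.py hunk copies into data_dir.
    data_dir = args.output_dir
    print(data_dir, args.dry_run, args.verbose)
```

With a parser of this shape, `args.output_dir` always has a value (the default or the user's override), so the added `data_dir = args.output_dir` line would give `SongMatcher` a defined directory in both cases.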