Signed-off-by: mbrucedogs <mbrucedogs@gmail.com>

mbrucedogs 2025-07-28 14:15:51 -05:00
parent 7f20ba3ffa
commit 598798d9de
4 changed files with 275 additions and 1 deletions

13
PRD.md

@ -48,16 +48,26 @@ These principles are fundamental to the project's long-term success and must be
- **PRD.md Updates:** Any changes to project requirements, architecture, or functionality must be reflected in this document
- **README.md Updates:** User-facing features, installation instructions, or usage changes must be documented
- **CLI Commands Documentation:** All CLI functionality, options, and usage examples must be documented in `cli/commands.txt`
- **Code Comments:** Significant logic changes should include inline documentation
- **API Documentation:** New endpoints, functions, or interfaces must be documented

**Documentation Update Checklist:**
- [ ] Update PRD.md with any architectural or requirement changes
- [ ] Update README.md with new features, installation steps, or usage instructions
- [ ] Update `cli/commands.txt` with any new CLI options, examples, or functionality changes
- [ ] Add inline comments for complex logic or business rules
- [ ] Update any configuration examples or file structure documentation
- [ ] Review and update implementation status sections

**CLI Commands Documentation Requirements:**
- **Comprehensive Coverage:** All CLI arguments, options, and flags must be documented with examples
- **Usage Examples:** Provide practical examples for common use cases and combinations
- **Configuration Details:** Document all configuration options and their effects
- **Error Handling:** Include troubleshooting information and common issues
- **Integration Notes:** Document how CLI integrates with web UI and other components
- **Version Tracking:** Keep version information and feature status up to date

This documentation requirement is mandatory and ensures the project remains maintainable and accessible to future developers and users.

### 2.3 Code Quality & Development Standards
@ -230,7 +240,8 @@ KaraokeMerge/
│   ├── matching.py        # Song matching logic
│   ├── report.py          # Report generation
│   ├── preferences.py     # Priority preferences management
│   ├── utils.py           # Utility functions
│   └── commands.txt       # Comprehensive CLI commands reference
├── web/                   # Web UI for manual review
│   ├── app.py             # Flask web application
│   └── templates/

262
cli/commands.txt Normal file

@ -0,0 +1,262 @@
# Karaoke Song Library Cleanup Tool - CLI Commands Reference
## Overview
The CLI tool analyzes karaoke song collections, identifies duplicates, and generates skip lists for future imports. It supports multiple file formats (MP3, CDG, MP4) with configurable priority systems.
## Basic Usage
### Standard Analysis
```bash
python cli/main.py
```
Runs the tool with default settings:
- Input: `data/allSongs.json`
- Config: `config/config.json`
- Output: `data/skipSongs.json`
- Verbose: Disabled
- Reports: Not saved
### Verbose Output
```bash
python cli/main.py --verbose
# or
python cli/main.py -v
```
Enables detailed output showing:
- Individual song processing
- Duplicate detection details
- File type analysis
- Channel priority decisions
### Dry Run Mode
```bash
python cli/main.py --dry-run
```
Analyzes songs without generating the skip list file. Useful for:
- Testing configuration changes
- Previewing results before committing
- Validating input data
## Configuration Options
### Custom Configuration File
```bash
python cli/main.py --config path/to/custom_config.json
```
Uses a custom configuration file instead of the default `config/config.json`.
### Show Current Configuration
```bash
python cli/main.py --show-config
```
Displays the current configuration settings and exits. Useful for:
- Verifying configuration values
- Debugging configuration issues
- Understanding current settings
## Input/Output Options
### Custom Input File
```bash
python cli/main.py --input path/to/songs.json
```
Specifies a custom input file instead of the default `data/allSongs.json`.
### Custom Output Directory
```bash
python cli/main.py --output-dir ./custom_output
```
Saves output files to a custom directory instead of the default `data/` folder.
## Report Generation
### Save Detailed Reports
```bash
python cli/main.py --save-reports
```
Generates comprehensive analysis reports in the output directory:
- `enhanced_summary_report.txt` - Comprehensive analysis
- `channel_optimization_report.txt` - Priority optimization suggestions
- `duplicate_pattern_report.txt` - Duplicate pattern analysis
- `actionable_insights_report.txt` - Recommendations and insights
- `detailed_duplicate_analysis.txt` - Specific songs and their duplicates
- `analysis_data.json` - Raw analysis data for further processing
- `skip_songs_detailed.json` - Full skip list with metadata
## Combined Examples
### Full Analysis with Reports
```bash
python cli/main.py --verbose --save-reports
```
Runs complete analysis with:
- Verbose output for detailed processing information
- Comprehensive report generation
- Skip list creation
### Custom Configuration with Dry Run
```bash
python cli/main.py --config custom_config.json --dry-run --verbose
```
Tests a custom configuration without generating files:
- Uses custom configuration
- Shows detailed processing
- No output files created
### Custom Input/Output with Reports
```bash
python cli/main.py --input /path/to/songs.json --output-dir ./reports --save-reports
```
Processes custom input and saves all outputs to reports directory:
- Custom input file
- Custom output location
- All report files generated
### Minimal Output
```bash
python cli/main.py --output-dir ./minimal
```
Writes the skip list to `./minimal` using the default (quiet) settings:
- No verbose logging
- No detailed reports
- Only generates skip list
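The options used in the examples above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of `cli/main.py`: the function name `build_parser`, the help strings, and the exact default for `--output-dir` are assumptions inferred from the defaults listed in this reference.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in this reference."""
    parser = argparse.ArgumentParser(
        description="Karaoke Song Library Cleanup Tool"
    )
    # Defaults below are taken from the "Standard Analysis" section.
    parser.add_argument("--config", default="config/config.json",
                        help="Path to the configuration file")
    parser.add_argument("--input", default="data/allSongs.json",
                        help="Path to the input songs file")
    parser.add_argument("--output-dir", default="data",
                        help="Directory for the skip list and reports")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable detailed output")
    parser.add_argument("--dry-run", action="store_true",
                        help="Analyze without writing the skip list")
    parser.add_argument("--save-reports", action="store_true",
                        help="Write the detailed report files")
    parser.add_argument("--show-config", action="store_true",
                        help="Print the current configuration and exit")
    return parser


# Equivalent to: python cli/main.py --config custom_config.json --dry-run --verbose
args = build_parser().parse_args(
    ["--config", "custom_config.json", "--dry-run", "--verbose"]
)
print(args.config, args.dry_run, args.verbose, args.output_dir)
```

Note that `argparse` converts `--output-dir` to the attribute `output_dir` and `--dry-run` to `dry_run`, which matches the `args.output_dir` access visible in `main()`.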
## Configuration File Structure
The default configuration file (`config/config.json`) contains:
```json
{
"channel_priorities": [
"Sing King Karaoke",
"KaraFun Karaoke",
"Stingray Karaoke"
],
"matching": {
"fuzzy_matching": false,
"fuzzy_threshold": 0.85,
"case_sensitive": false
},
"output": {
"verbose": false,
"include_reasons": true,
"max_duplicates_per_song": 10
},
"file_types": {
"supported_extensions": [".mp3", ".cdg", ".mp4"],
"mp4_extensions": [".mp4"]
}
}
```
### Configuration Options Explained
#### Channel Priorities
- **channel_priorities**: Array of folder names for MP4 files
- Order determines priority (first = highest priority)
- Files without matching folders are marked for manual review
#### Matching Settings
- **fuzzy_matching**: Enable/disable fuzzy string matching
- **fuzzy_threshold**: Similarity threshold (0.0-1.0) for fuzzy matching
- **case_sensitive**: Case-sensitive artist/title comparison
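To illustrate how these three settings might interact, here is a small sketch that uses the standard library's `difflib.SequenceMatcher` as the similarity measure. That choice is an assumption for illustration only; this reference does not say which algorithm `cli/matching.py` uses, and the helper names `normalize` and `is_match` are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str, case_sensitive: bool = False) -> str:
    """Trim whitespace and, unless case_sensitive is set, fold case."""
    text = text.strip()
    return text if case_sensitive else text.lower()


def is_match(a: str, b: str, fuzzy_matching: bool = False,
             fuzzy_threshold: float = 0.85,
             case_sensitive: bool = False) -> bool:
    """Return True if two artist or title strings count as the same."""
    left = normalize(a, case_sensitive)
    right = normalize(b, case_sensitive)
    if left == right:
        return True
    if not fuzzy_matching:
        return False
    # ratio() is a similarity score in [0.0, 1.0]; 1.0 means identical.
    return SequenceMatcher(None, left, right).ratio() >= fuzzy_threshold


print(is_match("Hey Jude", "hey jude"))                       # exact after case folding
print(is_match("Dont Stop Believin", "Don't Stop Believin'",  # needs fuzzy matching
               fuzzy_matching=True))
```

With `fuzzy_matching` disabled (the documented default), only strings that are equal after normalization match, so raising or lowering `fuzzy_threshold` has no effect until fuzzy matching is enabled.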
#### Output Settings
- **verbose**: Enable detailed output
- **include_reasons**: Include reason field in skip list
- **max_duplicates_per_song**: Maximum duplicates to process per song
#### File Type Settings
- **supported_extensions**: All supported file extensions
- **mp4_extensions**: Extensions treated as MP4 files
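A configuration file with this structure can be sanity-checked before a run. The sketch below is illustrative only: `validate_config` is a hypothetical helper, and the rule that every entry in `mp4_extensions` must also appear in `supported_extensions` is an assumption rather than documented behavior.

```python
def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    priorities = config.get("channel_priorities", [])
    if not isinstance(priorities, list) or not all(
        isinstance(name, str) for name in priorities
    ):
        problems.append("channel_priorities must be a list of folder names")

    threshold = config.get("matching", {}).get("fuzzy_threshold", 0.85)
    is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
    if not is_number or not 0.0 <= threshold <= 1.0:
        problems.append("matching.fuzzy_threshold must be between 0.0 and 1.0")

    file_types = config.get("file_types", {})
    supported = set(file_types.get("supported_extensions", []))
    # Assumed rule: MP4 extensions should be a subset of supported ones.
    unknown = [ext for ext in file_types.get("mp4_extensions", [])
               if ext not in supported]
    if unknown:
        problems.append("mp4_extensions not in supported_extensions: "
                        + ", ".join(unknown))
    return problems


example = {
    "channel_priorities": ["Sing King Karaoke", "KaraFun Karaoke"],
    "matching": {"fuzzy_matching": False, "fuzzy_threshold": 0.85},
    "file_types": {"supported_extensions": [".mp3", ".cdg", ".mp4"],
                   "mp4_extensions": [".mp4"]},
}
print(validate_config(example))
```

Collecting all problems in a list, rather than raising on the first one, lets a single run report every misconfigured option.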
## Input File Format
The tool expects a JSON array of song objects:
```json
[
{
"artist": "Artist Name",
"title": "Song Title",
"path": "path/to/file.mp3"
}
]
```
Optional fields for MP4 files:
- `channel`: Channel/folder information
- ID3 tag information (artist, title, etc.)
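The expected shape can be checked with a few lines of Python before running the tool. This is a sketch under the assumption that `artist`, `title`, and `path` are the required fields, which is how the example above reads; `parse_songs` is a hypothetical helper, not the tool's own `load_songs`.

```python
import json

# Assumed required fields, based on the example object above.
REQUIRED_FIELDS = ("artist", "title", "path")


def parse_songs(raw_json: str) -> list:
    """Parse the input text and verify it is an array of song objects."""
    data = json.loads(raw_json)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of song objects")
    for index, song in enumerate(data):
        if not isinstance(song, dict):
            raise ValueError(f"entry {index} is not an object")
        missing = [field for field in REQUIRED_FIELDS if field not in song]
        if missing:
            raise ValueError(f"entry {index} is missing: {', '.join(missing)}")
    return data


songs = parse_songs(
    '[{"artist": "Artist Name", "title": "Song Title", "path": "path/to/file.mp3"}]'
)
print(len(songs), songs[0]["path"])
```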
## Output Files
### Primary Output
- **skipSongs.json**: List of file paths to skip in future imports
- Format: `[{"path": "file/path.mp3", "reason": "duplicate"}]`
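As a rough illustration of that format, the following sketch builds skip-list entries from a list of duplicate paths. The helper `build_skip_list` is hypothetical, and tying the `reason` field to the `include_reasons` setting is an assumption based on the description of that option.

```python
import json


def build_skip_list(duplicate_paths, include_reasons=True):
    """Turn file paths into skip-list entries in the documented shape."""
    entries = []
    for path in duplicate_paths:
        entry = {"path": path}
        if include_reasons:
            # "duplicate" is the reason value shown in the format example.
            entry["reason"] = "duplicate"
        entries.append(entry)
    return entries


print(json.dumps(build_skip_list(["file/path.mp3"])))
```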
### Report Files (with --save-reports)
- **enhanced_summary_report.txt**: Overall analysis and statistics
- **channel_optimization_report.txt**: Channel priority suggestions
- **duplicate_pattern_report.txt**: Duplicate detection patterns
- **actionable_insights_report.txt**: Recommendations for collection management
- **detailed_duplicate_analysis.txt**: Specific duplicate groups
- **analysis_data.json**: Raw data for further processing
- **skip_songs_detailed.json**: Complete skip list with metadata
## File Type Priority System
The tool processes files in this priority order:
1. **MP4 files** (with channel priority sorting)
2. **CDG/MP3 pairs** (treated as single units)
3. **Standalone MP3** files
4. **Standalone CDG** files
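One way to express this ordering is a sort key that ranks each file by type and, for MP4 files, by channel folder. The sketch below is an interpretation rather than the tool's actual logic: `priority_key` is a hypothetical helper, channel detection by folder name is assumed from the `channel_priorities` description, and MP4 files with no matching folder are simply sorted last here, whereas the tool marks them for manual review.

```python
from pathlib import PurePosixPath

CHANNEL_PRIORITIES = ["Sing King Karaoke", "KaraFun Karaoke", "Stingray Karaoke"]


def priority_key(song, all_paths, channel_priorities):
    """Return a tuple that sorts preferred files first (lower is better)."""
    path = PurePosixPath(song["path"])
    ext = path.suffix.lower()
    if ext == ".mp4":
        folders = path.parts[:-1]
        ranks = [i for i, name in enumerate(channel_priorities) if name in folders]
        # Unmatched channels fall after every configured channel.
        return (0, min(ranks) if ranks else len(channel_priorities))
    # A CDG/MP3 pair shares a base name and differs only in extension.
    partner = path.with_suffix(".cdg" if ext == ".mp3" else ".mp3")
    if str(partner) in all_paths:
        return (1, 0)
    return (2, 0) if ext == ".mp3" else (3, 0)


songs = [
    {"path": "solo/track.cdg"},
    {"path": "solo/only.mp3"},
    {"path": "pairs/song.mp3"},
    {"path": "pairs/song.cdg"},
    {"path": "KaraFun Karaoke/b.mp4"},
    {"path": "Sing King Karaoke/a.mp4"},
]
all_paths = {song["path"] for song in songs}
ranked = sorted(songs, key=lambda s: priority_key(s, all_paths, CHANNEL_PRIORITIES))
for song in ranked:
    print(song["path"])
```

Because Python's sort is stable, files with equal keys (such as the two halves of a CDG/MP3 pair) keep their original relative order.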
## Error Handling
The tool provides clear error messages for:
- Missing input files
- Invalid JSON format
- Configuration errors
- File permission issues
- Processing errors
## Performance Notes
- Successfully tested with 37,000+ songs
- Processes large datasets efficiently
- Shows progress indicators for long operations
- Memory-efficient processing
## Integration with Web UI
The CLI tool integrates with the web UI:
- Web UI can load CLI-generated data
- Priority preferences from web UI are used by CLI
- Shared configuration and data files
- Consistent processing logic
## Troubleshooting
### Common Issues
1. **File not found**: Check input file path and permissions
2. **JSON errors**: Validate input file format
3. **Configuration errors**: Use --show-config to verify settings
4. **Permission errors**: Check output directory permissions
### Debug Mode
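```bash
python cli/main.py --show-config
python cli/main.py --verbose --dry-run
```
Complete debugging setup, run as two commands because `--show-config` displays the configuration and exits:
- Shows configuration
- Verbose processing
- No file changes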
## Version Information
This command reference applies to Karaoke Song Library Cleanup Tool v2.0:
- CLI: Fully functional with comprehensive options
- Web UI: Interactive priority management
- Priority System: Drag-and-drop with persistence
- Reports: Enhanced analysis with actionable insights


@ -123,6 +123,7 @@ def main():
    songs = load_songs(args.input)

    # Initialize components
    data_dir = args.output_dir
    matcher = SongMatcher(config, data_dir)
    reporter = ReportGenerator(config)