
# Karaoke Song Library Cleanup Tool

A tool for analyzing, deduplicating, and cleaning up large karaoke song collections. It identifies duplicate songs across file formats and writes a "skip list" (`data/skipSongs.json`) for use in future imports.

## Features

### Core Functionality
- **Song Deduplication**: Identifies duplicate songs based on artist + title matching
- **Multi-Format Support**: Handles MP3, CDG, and MP4 files
- **CDG/MP3 Pairing**: Treats CDG and MP3 files with the same base filename as single karaoke units
- **Channel Priority**: Ranks MP4 files by the channel folder names found in their paths (configurable in `config/config.json`)
- **Fuzzy Matching**: Configurable fuzzy matching for artist/title comparison
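
To illustrate the artist + title matching idea, here is a minimal sketch of fuzzy comparison using the standard library's `difflib`. It is not taken from `cli/matching.py`: the function names, the normalization step, and the `0.9` threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def is_same_song(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Return True when artist and title both meet the similarity threshold.

    The threshold value is a hypothetical default, not the tool's setting.
    """
    def similarity(x: str, y: str) -> float:
        return SequenceMatcher(None, normalize(x), normalize(y)).ratio()

    return (
        similarity(a["artist"], b["artist"]) >= threshold
        and similarity(a["title"], b["title"]) >= threshold
    )


first = {"artist": "The Beatles", "title": "Hey Jude"}
second = {"artist": "the  beatles", "title": "Hey Jude"}
third = {"artist": "Queen", "title": "Bohemian Rhapsody"}

print(is_same_song(first, second))  # differs only in case and spacing
print(is_same_song(first, third))   # different artist and title
```

Requiring both fields to pass the threshold keeps two different songs by the same artist from being merged.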

### File Type Priority System
1. **MP4 files** (with channel priority sorting)
2. **CDG/MP3 pairs** (treated as single units)
3. **Standalone MP3** files
4. **Standalone CDG** files
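
The ordering above, together with CDG/MP3 pairing and channel priority, can be sketched roughly as follows. This is an illustrative sketch rather than the tool's actual code: `CHANNEL_PRIORITY`, `group_units`, and `rank` are hypothetical names, the channel folder names are placeholders, and paths are assumed to use forward slashes.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical channel folder names, highest priority first.
CHANNEL_PRIORITY = ["ChannelA", "ChannelB"]


def group_units(paths):
    """Group paths so a CDG and MP3 with the same base filename form one unit."""
    units = defaultdict(list)
    for path in paths:
        p = PurePosixPath(path)
        # MP4 files are never paired, so key them by their full path.
        key = str(p) if p.suffix.lower() == ".mp4" else str(p.with_suffix(""))
        units[key].append(path)
    return list(units.values())


def rank(unit):
    """Return a sort key: lower tuples mean higher priority."""
    exts = {PurePosixPath(p).suffix.lower() for p in unit}
    if ".mp4" in exts:
        parts = PurePosixPath(unit[0]).parts
        # Unknown channels sort after every listed channel.
        channel = next(
            (i for i, name in enumerate(CHANNEL_PRIORITY) if name in parts),
            len(CHANNEL_PRIORITY),
        )
        return (0, channel)
    if exts == {".cdg", ".mp3"}:
        return (1, 0)
    if exts == {".mp3"}:
        return (2, 0)
    return (3, 0)


paths = [
    "cd/song.cdg", "cd/song.mp3",
    "ChannelB/song.mp4", "ChannelA/song.mp4",
    "misc/solo.mp3", "misc/only.cdg",
]
for unit in sorted(group_units(paths), key=rank):
    print(unit)
```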

### Web UI Features
- **Interactive Table View**: Sortable, filterable grid of duplicate songs
- **Bulk Selection**: Select multiple items for batch operations
- **Search & Filter**: Real-time search across artists, titles, and paths
- **Responsive Design**: Mobile-friendly interface
- **Easy Startup**: Automated dependency checking and browser launch

### 🆕 Drag-and-Drop Priority Management
- **Visual Priority Reordering**: Drag and drop files within each duplicate group to change their priority
- **Persistent Preferences**: Save your priority preferences for future CLI runs
- **Priority Indicators**: Visual numbered indicators show the current priority order
- **Reset Functionality**: Easily reset to default priorities if needed

## Installation

1. Clone the repository
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

## Usage

### CLI Tool

Run the main CLI tool:
```bash
python cli/main.py
```

Options:
- `--verbose`: Enable verbose output
- `--save-reports`: Generate detailed analysis reports
- `--dry-run`: Show what would be done without making changes

### Web UI

Start the web interface:
```bash
python start_web_ui.py
```

The startup script automatically:
1. Check for required dependencies
2. Start the Flask server
3. Open your default browser to the interface

### Priority Preferences

The web UI now supports drag-and-drop priority management:

1. **Reorder Files**: Click the "Details" button for any duplicate group, then drag files to reorder them
2. **Save Preferences**: Click "Save Priority Preferences" to store your choices
3. **Apply to CLI**: Future CLI runs will automatically use your saved preferences
4. **Reset**: Use "Reset Priorities" to restore default behavior

Your preferences are saved in `data/preferences/priority_preferences.json` and are loaded automatically by the CLI tool.

## Configuration

Edit `config/config.json` to customize:
- Channel priorities for MP4 files
- Matching settings (fuzzy matching, thresholds)
- Output options
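
As a rough sketch of how such settings could be read with fallbacks, the snippet below overlays a config file on a set of defaults. The key names (`channel_priorities`, `matching`, `fuzzy_threshold`, `output`) are assumptions for illustration only; check your own `config/config.json` for the real keys.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real keys in config/config.json may differ.
DEFAULTS = {
    "channel_priorities": [],
    "matching": {"fuzzy_matching": False, "fuzzy_threshold": 0.9},
    "output": {"verbose": False},
}


def load_config(path="config/config.json"):
    """Return DEFAULTS overlaid with whatever the config file provides."""
    config = json.loads(json.dumps(DEFAULTS))  # cheap deep copy
    file = Path(path)
    if file.exists():
        user = json.loads(file.read_text(encoding="utf-8"))
        for key, value in user.items():
            # Merge nested sections one level deep; replace everything else.
            if isinstance(value, dict) and isinstance(config.get(key), dict):
                config[key].update(value)
            else:
                config[key] = value
    return config
```

Merging one level deep means a file that sets only one matching option still inherits the remaining defaults in that section.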

## File Structure

```
KaraokeMerge/
├── data/
│   ├── allSongs.json          # Input: Your song library data
│   ├── skipSongs.json         # Output: Generated skip list
│   ├── preferences/           # User priority preferences
│   │   └── priority_preferences.json
│   └── reports/               # Detailed analysis reports
├── config/
│   └── config.json            # Configuration settings
├── cli/
│   ├── main.py                # Main CLI application
│   ├── matching.py            # Song matching logic
│   ├── preferences.py         # Priority preferences manager
│   ├── report.py              # Report generation
│   └── utils.py               # Utility functions
├── web/                       # Web UI for manual review
│   ├── app.py                 # Flask web application
│   └── templates/
│       └── index.html         # Web interface template
├── start_web_ui.py            # Web UI startup script
├── test_tool.py               # Validation and testing script
├── requirements.txt           # Python dependencies
├── PRD.md                     # Product Requirements Document
└── README.md                  # Project documentation
```

## Data Requirements

Place your song library data in `data/allSongs.json` with the following format:
```json
[
  {
    "artist": "Artist Name",
    "title": "Song Title",
    "path": "path/to/file.mp3"
  }
]
```
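
Before running the tool, it can help to check that every entry has the three fields shown above. The following is a small standalone sketch, not part of the project; `find_invalid_entries` is a hypothetical helper name.

```python
import json

REQUIRED_FIELDS = ("artist", "title", "path")


def find_invalid_entries(songs):
    """Return (index, missing_fields) for entries lacking a required string field."""
    problems = []
    for index, song in enumerate(songs):
        missing = [
            field for field in REQUIRED_FIELDS
            if not isinstance(song, dict) or not isinstance(song.get(field), str)
        ]
        if missing:
            problems.append((index, missing))
    return problems


sample = json.loads("""
[
  {"artist": "Artist Name", "title": "Song Title", "path": "path/to/file.mp3"},
  {"artist": "Artist Name", "title": "Song Title"}
]
""")
print(find_invalid_entries(sample))  # [(1, ['path'])]
```

To check a real library, load `data/allSongs.json` with `json.load` and pass the resulting list to the same function.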

## Performance

Successfully tested with:
- 37,015 songs
- 12,424 duplicates (33.6% duplicate rate)
- 10,998 unique files after deduplication

## Contributing

This project follows strict architectural principles:
- **Separation of Concerns**: Modular design with focused responsibilities
- **Constants and Enums**: Centralized configuration
- **Readability**: Self-documenting code with clear naming
- **Extensibility**: Designed for future growth
- **Refactorability**: Minimal coupling between components