# MusicBrainz Data Cleaner - CLI Commands Reference
## Overview
The MusicBrainz Data Cleaner is a command-line interface (CLI) tool that processes JSON song data files and cleans/normalizes the metadata using the MusicBrainz database.
## Basic Command Structure
```bash
python musicbrainz_cleaner.py <input_file> [output_file] [options]
```
## Command Arguments
### Required Arguments
| Argument | Type | Description | Example |
|----------|------|-------------|---------|
| `input_file` | string | Path to the JSON file containing song data | `my_songs.json` |
### Optional Arguments
| Argument | Type | Description | Example |
|----------|------|-------------|---------|
| `output_file` | string | Path for the cleaned output file | `cleaned_songs.json` |
| `--help` | flag | Show help information | `--help` |
| `--version` | flag | Show version information | `--version` |
## Command Examples
### Basic Usage
```bash
# Clean songs and save to auto-generated filename
python musicbrainz_cleaner.py songs.json
# Output: songs_cleaned.json
```
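The auto-generated name follows the `<input>_cleaned.json` pattern shown above. A minimal sketch of that naming rule, assuming the tool inserts `_cleaned` before the file extension (the helper name `default_output_path` is hypothetical and not taken from the script):

```python
from pathlib import Path


def default_output_path(input_path: str) -> str:
    """Return the input path with '_cleaned' inserted before the extension."""
    p = Path(input_path)
    # with_name() keeps the parent directory and swaps only the file name.
    return str(p.with_name(f"{p.stem}_cleaned{p.suffix}"))


print(default_output_path("songs.json"))
```

Computing the name up front is handy in wrapper scripts that need to know where the output will land before the tool runs.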
### Custom Output File
```bash
# Specify custom output filename
python musicbrainz_cleaner.py songs.json cleaned_songs.json
```
### Help and Information
```bash
# Show help information
python musicbrainz_cleaner.py --help
# Show version information
python musicbrainz_cleaner.py --version
```
## Input File Format
The input file must be a valid JSON file containing an array of song objects:
```json
[
  {
    "artist": "ACDC",
    "title": "Shot In The Dark",
    "disabled": false,
    "favorite": true,
    "guid": "8946008c-7acc-d187-60e6-5286e55ad502",
    "path": "z://MP4\\ACDC - Shot In The Dark (Karaoke Version).mp4"
  }
]
```
### Required Fields
- `artist`: The artist name (string)
- `title`: The song title (string)
### Optional Fields
Any additional fields will be preserved in the output:
- `disabled`: Boolean flag
- `favorite`: Boolean flag
- `guid`: Unique identifier
- `path`: File path
- Any other custom fields
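The field requirements above can be checked before running the tool. The following is a rough pre-flight sketch, not the tool's own validation logic; the function name `validate_songs` and the wording of its messages are assumptions made for illustration.

```python
import json


def validate_songs(data):
    """Return a list of problems; an empty list means the input looks usable."""
    if not isinstance(data, list):
        return ["top-level JSON value must be an array"]
    problems = []
    for i, song in enumerate(data):
        if not isinstance(song, dict):
            problems.append(f"item {i}: not an object")
            continue
        # 'artist' and 'title' are the two required string fields.
        for field in ("artist", "title"):
            if not isinstance(song.get(field), str):
                problems.append(f"item {i}: missing or non-string '{field}'")
    return problems


sample = json.loads(
    '[{"artist": "ACDC", "title": "Shot In The Dark"}, {"artist": "Queen"}]'
)
print(validate_songs(sample))
```

Optional fields are deliberately ignored here, since the tool passes them through unchanged.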
## Output File Format
The output file will contain the same structure with cleaned data and added MBID fields:
```json
[
  {
    "artist": "AC/DC",
    "title": "Shot in the Dark",
    "disabled": false,
    "favorite": true,
    "guid": "8946008c-7acc-d187-60e6-5286e55ad502",
    "path": "z://MP4\\ACDC - Shot In The Dark (Karaoke Version).mp4",
    "mbid": "66c662b6-6e2f-4930-8610-912e24c63ed1",
    "recording_mbid": "cf8b5cd0-d97c-413d-882f-fc422a2e57db"
  }
]
```
### Added Fields
- `mbid`: MusicBrainz Artist ID (string)
- `recording_mbid`: MusicBrainz Recording ID (string)
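After a run, the added fields make it easy to see how many songs were matched. This sketch assumes that songs the tool could not match simply lack `mbid` or `recording_mbid` (or carry an empty value), which this reference does not state explicitly; `summarize` is a hypothetical helper name.

```python
def summarize(songs):
    """Count how many song objects carry each MusicBrainz ID field."""
    return {
        "total": len(songs),
        # .get() treats a missing field and an empty value the same way.
        "with_artist_mbid": sum(1 for s in songs if s.get("mbid")),
        "with_recording_mbid": sum(1 for s in songs if s.get("recording_mbid")),
    }


cleaned = [
    {
        "artist": "AC/DC",
        "title": "Shot in the Dark",
        "mbid": "66c662b6-6e2f-4930-8610-912e24c63ed1",
        "recording_mbid": "cf8b5cd0-d97c-413d-882f-fc422a2e57db",
    },
    {"artist": "Bruno Mars ft. Cardi B", "title": "Finesse Remix"},
]
print(summarize(cleaned))
```

In practice `cleaned` would come from `json.load()` on the output file.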
## Command Line Options
### Help Option
```bash
python musicbrainz_cleaner.py --help
```
**Output:**
```
Usage: python musicbrainz_cleaner.py <input_file.json> [output_file.json]

MusicBrainz Data Cleaner - Clean and normalize song data using MusicBrainz

Arguments:
  input_file.json     JSON file containing array of song objects
  output_file.json    Optional: Output file for cleaned data
                      (default: input_file_cleaned.json)

Examples:
  python musicbrainz_cleaner.py songs.json
  python musicbrainz_cleaner.py songs.json cleaned_songs.json

Requirements:
  - MusicBrainz server running on http://localhost:5001
  - Python 3.6+ with requests library
```
### Version Option
```bash
python musicbrainz_cleaner.py --version
```
**Output:**
```
MusicBrainz Data Cleaner v1.0.0
```
## Error Messages and Exit Codes
### Exit Codes
| Code | Meaning | Description |
|------|---------|-------------|
| 0 | Success | Processing completed successfully |
| 1 | Error | General error occurred |
| 2 | Usage Error | Invalid command line arguments |
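
A wrapper script can branch on the exit codes in the table above. The sketch below shows one way to do that from Python; `describe_exit` and `run_cleaner` are hypothetical helper names, and the script is assumed to sit in the current working directory.

```python
import subprocess
import sys

# Meanings copied from the exit code table in this reference.
EXIT_MEANINGS = {0: "Success", 1: "Error", 2: "Usage Error"}


def describe_exit(code: int) -> str:
    """Map an exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"Unknown exit code {code}")


def run_cleaner(input_file: str) -> int:
    """Run the cleaner on one file and return its exit code."""
    result = subprocess.run(
        [sys.executable, "musicbrainz_cleaner.py", input_file]
    )
    return result.returncode


# Example (requires the script and a reachable MusicBrainz server):
#   code = run_cleaner("songs.json")
#   print(describe_exit(code))
```

Using `sys.executable` runs the tool with the same interpreter as the wrapper, which avoids surprises when several Python versions are installed.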
### Common Error Messages
#### File Not Found
```
Error: File 'songs.json' not found
```
#### Invalid JSON
```
Error: Invalid JSON in file 'songs.json'
```
#### Invalid Input Format
```
Error: Input file should contain a JSON array of songs
```
#### Connection Error
```
Error searching for artist 'Artist Name': Connection refused
```
#### Missing Dependencies
```
ModuleNotFoundError: No module named 'requests'
```
## Processing Output
### Progress Indicators
```
Processing 3 songs...
==================================================
[1/3] Processing: ACDC - Shot In The Dark
✅ Found artist: AC/DC (MBID: 66c662b6-6e2f-4930-8610-912e24c63ed1)
✅ Found recording: Shot in the Dark (MBID: cf8b5cd0-d97c-413d-882f-fc422a2e57db)
✅ Updated to: AC/DC - Shot in the Dark
[2/3] Processing: Bruno Mars ft. Cardi B - Finesse Remix
❌ Could not find artist: Bruno Mars ft. Cardi B
[3/3] Processing: Taylor Swift - Love Story
✅ Found artist: Taylor Swift (MBID: 20244d07-534f-4eff-b4d4-930878889970)
✅ Found recording: Love Story (MBID: d783e6c5-761f-4fc3-bfcf-6089cdfc8f96)
✅ Updated to: Taylor Swift - Love Story
==================================================
✅ Processing complete!
📁 Output saved to: songs_cleaned.json
```
### Status Indicators
| Symbol | Meaning | Description |
|--------|---------|-------------|
| ✅ | Success | Operation completed successfully |
| ❌ | Error | Operation failed |
| 🔄 | Processing | Currently processing |
## Batch Processing
### Multiple Files
To process multiple files, you can use shell scripting:
```bash
# Process all JSON files in current directory, skipping earlier output files
for file in *.json; do
  case "$file" in *_cleaned.json) continue ;; esac
  python musicbrainz_cleaner.py "$file"
done
```
### Large Files
For large files, the tool processes songs one at a time and pauses 0.1 seconds between API calls to avoid overloading the MusicBrainz server.
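
The pacing described above can be pictured with a small loop. This is a simplified sketch, not the tool's actual code: it pauses once per song rather than once per API call, and `process_all` and `clean_one` are hypothetical names.

```python
import time

API_DELAY = 0.1  # seconds; the documented default delay


def process_all(songs, clean_one, delay=API_DELAY, sleep=time.sleep):
    """Apply clean_one to each song, pausing between consecutive songs."""
    results = []
    for i, song in enumerate(songs):
        results.append(clean_one(song))
        # No pause is needed after the final song.
        if i < len(songs) - 1:
            sleep(delay)
    return results


# Example with a stand-in cleaning function that uppercases the title.
print(process_all([{"title": "a"}, {"title": "b"}],
                  lambda s: {**s, "title": s["title"].upper()}))
```

Under this simplification, a file with 10,000 songs spends roughly 1,000 seconds (about 17 minutes) in delays alone, so large batches are best run unattended.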
## Environment Variables
No environment variables are documented for the tool. It uses the following default configuration, which is set in the script:
| Setting | Default | Description |
|---------|---------|-------------|
| MusicBrainz URL | `http://localhost:5001` | Local MusicBrainz server URL |
| API Delay | `0.1` seconds | Delay between API calls |
## Troubleshooting Commands
### Check MusicBrainz Server Status
```bash
# Test if server is running
curl -I http://localhost:5001
# Test API endpoint
curl "http://localhost:5001/ws/2/artist/?query=name:AC/DC&fmt=json"
```
### Validate JSON File
```bash
# Check if JSON is valid
python -m json.tool songs.json
# Check JSON structure (fails if the top-level value is not an array)
python -c "import json; data=json.load(open('songs.json')); assert isinstance(data, list), 'not a JSON array'; print('Valid JSON array with', len(data), 'items')"
```
### Check Python Dependencies
```bash
# Check if requests is installed
python -c "import requests; print('requests version:', requests.__version__)"
# Install if missing
pip install requests
```
## Advanced Usage
### Custom MusicBrainz Server
To use a different MusicBrainz server, modify the script:
```python
# In musicbrainz_cleaner.py, change:
self.base_url = "http://your-server:5001"
```
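If editing the script for each environment is inconvenient, one option is to have it read the URL from an environment variable and fall back to the default. This is a sketch of a possible modification only: the variable name `MUSICBRAINZ_URL` and the helper `resolve_base_url` are hypothetical, and the script as documented does not read them.

```python
import os

DEFAULT_URL = "http://localhost:5001"  # documented default server URL


def resolve_base_url(environ=os.environ):
    """Return the server URL from MUSICBRAINZ_URL (hypothetical), else the default."""
    url = environ.get("MUSICBRAINZ_URL", "").strip()
    # Drop a trailing slash so paths such as /ws/2/artist/ join cleanly.
    return (url or DEFAULT_URL).rstrip("/")


print(resolve_base_url({}))
```

The result could then be assigned to `self.base_url` in place of the hard-coded string.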
### Verbose Output
For debugging, you can modify the script to add more verbose output by uncommenting debug print statements.
## Command Line Shortcuts
### Common Aliases
Add these to your shell profile for convenience:
```bash
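# Add to ~/.bashrc or ~/.zshrc (use the full path so the alias works from any directory)
alias mbclean='python /path/to/musicbrainz-cleaner/musicbrainz_cleaner.py'
alias mbclean-help='python /path/to/musicbrainz-cleaner/musicbrainz_cleaner.py --help'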
```
### Usage with Aliases
```bash
# Using alias
mbclean songs.json
# Show help
mbclean-help
```
## Integration Examples
### With Git
```bash
# Process files and commit changes
python musicbrainz_cleaner.py songs.json
git add songs_cleaned.json
git commit -m "Clean song metadata with MusicBrainz IDs"
```
### With Cron Jobs
```bash
# Add to crontab to process files daily
0 2 * * * cd /path/to/musicbrainz-cleaner && python musicbrainz_cleaner.py /path/to/songs.json
```
### With Shell Scripts
```bash
#!/bin/bash
# clean_songs.sh
INPUT_FILE="$1"
OUTPUT_FILE="${INPUT_FILE%.json}_cleaned.json"
# Test the command directly instead of inspecting $? afterwards
if python musicbrainz_cleaner.py "$INPUT_FILE" "$OUTPUT_FILE"; then
  echo "Successfully cleaned $INPUT_FILE"
  echo "Output saved to $OUTPUT_FILE"
else
  echo "Error processing $INPUT_FILE"
  exit 1
fi
```
## Command Reference Summary
| Command | Description |
|---------|-------------|
| `python musicbrainz_cleaner.py file.json` | Basic usage |
| `python musicbrainz_cleaner.py file.json output.json` | Custom output |
| `python musicbrainz_cleaner.py --help` | Show help |
| `python musicbrainz_cleaner.py --version` | Show version |