# MusicBrainz Data Cleaner - CLI Commands Reference

## Overview

The MusicBrainz Data Cleaner is a command-line interface (CLI) tool that reads a JSON file of song records, normalizes the artist and title metadata against a MusicBrainz database, and adds the matching MusicBrainz IDs (MBIDs).

## Basic Command Structure

```bash
python musicbrainz_cleaner.py <input_file> [output_file] [options]
```

## Command Arguments

### Required Arguments

| Argument | Type | Description | Example |
|----------|------|-------------|---------|
| `input_file` | string | Path to the JSON file containing song data | `my_songs.json` |

### Optional Arguments

| Argument | Type | Description | Example |
|----------|------|-------------|---------|
| `output_file` | string | Path for the cleaned output file (default: the input name with `_cleaned` appended, e.g. `songs_cleaned.json`) | `cleaned_songs.json` |
| `--help` | flag | Show help information | `--help` |
| `--version` | flag | Show version information | `--version` |

## Command Examples

### Basic Usage

```bash
# Clean songs and save to auto-generated filename
python musicbrainz_cleaner.py songs.json
# Output: songs_cleaned.json
```
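
The auto-generated filename follows the `input_file_cleaned.json` pattern shown in the help text. A minimal sketch of that naming rule, assuming the tool simply inserts `_cleaned` before the extension (`default_output_path` is a hypothetical helper, not a function from the script):

```python
from pathlib import Path


def default_output_path(input_path: str) -> str:
    """Return the input path with '_cleaned' inserted before the extension."""
    p = Path(input_path)
    # Assumption: the suffix is kept and '_cleaned' is appended to the stem.
    return str(p.with_name(f"{p.stem}_cleaned{p.suffix}"))


print(default_output_path("songs.json"))  # songs_cleaned.json
```

This is handy in wrapper scripts that need to know where the output will land before running the tool.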

### Custom Output File

```bash
# Specify custom output filename
python musicbrainz_cleaner.py songs.json cleaned_songs.json
```

### Help and Information

```bash
# Show help information
python musicbrainz_cleaner.py --help

# Show version information
python musicbrainz_cleaner.py --version
```

## Input File Format

The input file must be a valid JSON file containing an array of song objects:

```json
[
  {
    "artist": "ACDC",
    "title": "Shot In The Dark",
    "disabled": false,
    "favorite": true,
    "guid": "8946008c-7acc-d187-60e6-5286e55ad502",
    "path": "z://MP4\\ACDC - Shot In The Dark (Karaoke Version).mp4"
  }
]
```

### Required Fields

- `artist`: The artist name (string)
- `title`: The song title (string)

### Optional Fields

Any additional fields will be preserved in the output:
- `disabled`: Boolean flag
- `favorite`: Boolean flag
- `guid`: Unique identifier
- `path`: File path
- Any other custom fields
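
The field rules above can be checked before running the cleaner. The following is a rough sketch, not the tool's own validation code; `validate_songs` and its per-item error messages are hypothetical, and only the "JSON array" message is taken from this document:

```python
import json


def validate_songs(text: str) -> list:
    """Parse JSON text and check it matches the documented input shape."""
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, list):
        raise ValueError("Input file should contain a JSON array of songs")
    for index, song in enumerate(data):
        if not isinstance(song, dict):
            raise ValueError(f"Item {index} is not a JSON object")
        # Only 'artist' and 'title' are required; extra fields are allowed.
        for field in ("artist", "title"):
            if not isinstance(song.get(field), str):
                raise ValueError(f"Item {index} is missing a string '{field}'")
    return data


sample = '[{"artist": "ACDC", "title": "Shot In The Dark", "favorite": true}]'
print(len(validate_songs(sample)), "valid song(s)")
```

Extra fields such as `favorite` pass through untouched, which mirrors how the tool is described as preserving them.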

## Output File Format

The output file will contain the same structure with cleaned data and added MBID fields:

```json
[
  {
    "artist": "AC/DC",
    "title": "Shot in the Dark",
    "disabled": false,
    "favorite": true,
    "guid": "8946008c-7acc-d187-60e6-5286e55ad502",
    "path": "z://MP4\\ACDC - Shot In The Dark (Karaoke Version).mp4",
    "mbid": "66c662b6-6e2f-4930-8610-912e24c63ed1",
    "recording_mbid": "cf8b5cd0-d97c-413d-882f-fc422a2e57db"
  }
]
```

### Added Fields

- `mbid`: MusicBrainz Artist ID (string)
- `recording_mbid`: MusicBrainz Recording ID (string)
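
After a run, it can be useful to see which songs received both IDs. The sketch below assumes that songs the tool could not match are written back without the `mbid` and `recording_mbid` keys; this document does not state how unmatched songs are represented, so treat that as an assumption. `split_by_match` is a hypothetical helper.

```python
def split_by_match(songs):
    """Separate songs that carry both MBID fields from those that do not."""
    matched, unmatched = [], []
    for song in songs:
        has_ids = bool(song.get("mbid")) and bool(song.get("recording_mbid"))
        (matched if has_ids else unmatched).append(song)
    return matched, unmatched


cleaned = [
    {"artist": "AC/DC", "title": "Shot in the Dark",
     "mbid": "66c662b6-6e2f-4930-8610-912e24c63ed1",
     "recording_mbid": "cf8b5cd0-d97c-413d-882f-fc422a2e57db"},
    # Hypothetical unmatched entry: assumed to be written back without MBIDs.
    {"artist": "Bruno Mars ft. Cardi B", "title": "Finesse Remix"},
]
matched, unmatched = split_by_match(cleaned)
print(len(matched), "matched,", len(unmatched), "unmatched")
```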

## Command Line Options

### Help Option

```bash
python musicbrainz_cleaner.py --help
```

**Output:**
```
Usage: python musicbrainz_cleaner.py <input_file.json> [output_file.json]

MusicBrainz Data Cleaner - Clean and normalize song data using MusicBrainz

Arguments:
  input_file.json     JSON file containing array of song objects
  output_file.json    Optional: Output file for cleaned data
                      (default: input_file_cleaned.json)

Examples:
  python musicbrainz_cleaner.py songs.json
  python musicbrainz_cleaner.py songs.json cleaned_songs.json

Requirements:
  - MusicBrainz server running on http://localhost:5001
  - Python 3.6+ with requests library
```

### Version Option

```bash
python musicbrainz_cleaner.py --version
```

**Output:**
```
MusicBrainz Data Cleaner v1.0.0
```

## Error Messages and Exit Codes

### Exit Codes

| Code | Meaning | Description |
|------|---------|-------------|
| 0 | Success | Processing completed successfully |
| 1 | Error | General error occurred |
| 2 | Usage Error | Invalid command line arguments |
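
A calling script can branch on these exit codes. The following is a minimal sketch using the standard `subprocess` module; `describe_exit` and `run_cleaner` are hypothetical wrapper names, and the script path is assumed to be relative to the current directory:

```python
import subprocess
import sys

# Meanings copied from the exit code table above.
EXIT_MEANINGS = {0: "Success", 1: "Error", 2: "Usage Error"}


def describe_exit(code: int) -> str:
    """Translate an exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"Unknown exit code {code}")


def run_cleaner(input_file: str, script: str = "musicbrainz_cleaner.py") -> str:
    """Run the cleaner with the current interpreter and describe the result."""
    result = subprocess.run([sys.executable, script, input_file])
    return describe_exit(result.returncode)


print(describe_exit(2))  # Usage Error
```

Calling `run_cleaner("songs.json")` from the tool's directory would run the cleaner and return one of the three documented meanings.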

### Common Error Messages

#### File Not Found
```
Error: File 'songs.json' not found
```

#### Invalid JSON
```
Error: Invalid JSON in file 'songs.json'
```

#### Invalid Input Format
```
Error: Input file should contain a JSON array of songs
```

#### Connection Error
```
Error searching for artist 'Artist Name': Connection refused
```

#### Missing Dependencies
```
ModuleNotFoundError: No module named 'requests'
```

## Processing Output

### Progress Indicators

```
Processing 3 songs...
==================================================

[1/3] Processing: ACDC - Shot In The Dark
✅ Found artist: AC/DC (MBID: 66c662b6-6e2f-4930-8610-912e24c63ed1)
✅ Found recording: Shot in the Dark (MBID: cf8b5cd0-d97c-413d-882f-fc422a2e57db)
✅ Updated to: AC/DC - Shot in the Dark

[2/3] Processing: Bruno Mars ft. Cardi B - Finesse Remix
❌ Could not find artist: Bruno Mars ft. Cardi B

[3/3] Processing: Taylor Swift - Love Story
✅ Found artist: Taylor Swift (MBID: 20244d07-534f-4eff-b4d4-930878889970)
✅ Found recording: Love Story (MBID: d783e6c5-761f-4fc3-bfcf-6089cdfc8f96)
✅ Updated to: Taylor Swift - Love Story

==================================================
✅ Processing complete!
📁 Output saved to: songs_cleaned.json
```

### Status Indicators

| Symbol | Meaning | Description |
|--------|---------|-------------|
| ✅ | Success | Operation completed successfully |
| ❌ | Error | Operation failed |
| 🔄 | Processing | Currently processing |
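
Because the progress output is line-oriented, a captured log can be summarized per song. This is a rough sketch that assumes the exact line shapes shown in the sample output above (`[n/m] Processing: ...`, `✅ Updated to: ...`, and lines starting with `❌`); `summarize_log` is a hypothetical helper and may need adjusting if the real output differs.

```python
import re

# Matches per-song header lines such as "[2/3] Processing: Artist - Title".
HEADER = re.compile(r"^\[(\d+)/(\d+)\] Processing: (.+)$")


def summarize_log(log: str) -> dict:
    """Map each 'Artist - Title' header to 'updated', 'failed', or 'unknown'."""
    results = {}
    current = None
    for raw in log.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = match.group(3)
            results[current] = "unknown"
        elif line.startswith("="):
            current = None  # separator line: no song is in progress
        elif current and line.startswith("❌"):
            results[current] = "failed"
        elif current and line.startswith("✅ Updated to:"):
            results[current] = "updated"
    return results


sample_log = """\
[1/2] Processing: ACDC - Shot In The Dark
✅ Found artist: AC/DC (MBID: 66c662b6-6e2f-4930-8610-912e24c63ed1)
✅ Updated to: AC/DC - Shot in the Dark

[2/2] Processing: Bruno Mars ft. Cardi B - Finesse Remix
❌ Could not find artist: Bruno Mars ft. Cardi B
==================================================
✅ Processing complete!
"""
for song, status in summarize_log(sample_log).items():
    print(f"{status:8} {song}")
```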

## Batch Processing

### Multiple Files

To process multiple files, you can use shell scripting:

```bash
# Process all JSON files in current directory
for file in *.json; do
    # Skip outputs from earlier runs so they are not cleaned again
    case "$file" in *_cleaned.json) continue ;; esac
    python musicbrainz_cleaner.py "$file"
done
```
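
The same batch run can be driven from Python. The sketch below assumes the default `_cleaned` output naming described earlier and skips such files so outputs are not fed back in; `is_pending` and `run_batch` are hypothetical names.

```python
import subprocess
import sys
from pathlib import Path


def is_pending(path: Path) -> bool:
    """True for JSON inputs that are not already '_cleaned' outputs."""
    return path.suffix == ".json" and not path.stem.endswith("_cleaned")


def run_batch(directory: str, script: str = "musicbrainz_cleaner.py") -> dict:
    """Run the cleaner on each pending file; return {filename: exit code}."""
    codes = {}
    for path in sorted(Path(directory).glob("*.json")):
        if is_pending(path):
            result = subprocess.run([sys.executable, script, str(path)])
            codes[path.name] = result.returncode
    return codes


# Show which names would be processed, without running anything.
names = ["songs.json", "songs_cleaned.json", "party.json", "notes.txt"]
print([name for name in names if is_pending(Path(name))])
```

Collecting the exit codes makes it easy to report which files failed once the batch finishes.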

### Large Files

The tool processes songs one at a time and waits 0.1 seconds between API calls to avoid overloading the MusicBrainz server, so run time grows with the number of songs in the file.
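
The delay sets a lower bound on run time. As a back-of-the-envelope sketch, assuming two API calls per song (one artist lookup and one recording lookup, as the sample progress output suggests) and ignoring the time the server takes to respond:

```python
def minimum_delay_seconds(song_count: int, calls_per_song: int = 2,
                          delay: float = 0.1) -> float:
    """Lower bound on time spent waiting between API calls."""
    # calls_per_song=2 is an assumption, not a documented value.
    return song_count * calls_per_song * delay


print(f"{minimum_delay_seconds(10_000) / 60:.1f} minutes")  # 33.3 minutes
```

Actual run time will be longer, since each request also takes time to complete.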

## Environment Variables

The tool uses the following default configuration:

| Setting | Default | Description |
|---------|---------|-------------|
| MusicBrainz URL | `http://localhost:5001` | Local MusicBrainz server URL |
| API Delay | `0.1` seconds | Delay between API calls |

## Troubleshooting Commands

### Check MusicBrainz Server Status

```bash
# Test if server is running
curl -I http://localhost:5001

# Test API endpoint (quote the URL so the shell does not treat & as a background operator)
curl "http://localhost:5001/ws/2/artist/?query=name:AC/DC&fmt=json"
```
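
Artist names containing characters such as `/`, `&`, or spaces are safer to send URL-encoded. A small sketch with the standard library, reusing the endpoint path from the `curl` example; `artist_search_url` is a hypothetical helper, not part of the script:

```python
from urllib.parse import urlencode


def artist_search_url(name: str, base_url: str = "http://localhost:5001") -> str:
    """Build an artist search URL with the query string percent-encoded."""
    query = urlencode({"query": f"name:{name}", "fmt": "json"})
    return f"{base_url}/ws/2/artist/?{query}"


print(artist_search_url("AC/DC"))
# http://localhost:5001/ws/2/artist/?query=name%3AAC%2FDC&fmt=json
```

The resulting URL can be pasted into `curl` (still quoted) or a browser to test the endpoint.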

### Validate JSON File

```bash
# Check if JSON is valid
python -m json.tool songs.json

# Check JSON structure (fails if the top-level value is not an array)
python -c "import json; data=json.load(open('songs.json')); assert isinstance(data, list), 'not a JSON array'; print('Valid JSON array with', len(data), 'items')"
```

### Check Python Dependencies

```bash
# Check if requests is installed
python -c "import requests; print('requests version:', requests.__version__)"

# Install if missing
pip install requests
```

## Advanced Usage

### Custom MusicBrainz Server

To use a different MusicBrainz server, modify the script:

```python
# In musicbrainz_cleaner.py, change:
self.base_url = "http://your-server:5001"
```
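
If editing the script for every server is inconvenient, one option is to read the URL from the environment and fall back to the documented default. This is only a sketch of a possible modification: the script as documented has no such option, and the `MUSICBRAINZ_URL` variable and `resolve_base_url` function are hypothetical names.

```python
import os

DEFAULT_BASE_URL = "http://localhost:5001"  # default from the configuration table


def resolve_base_url(environ=os.environ) -> str:
    """Return the server URL from MUSICBRAINZ_URL (hypothetical) or the default."""
    # Strip a trailing slash so later path joins do not produce '//'.
    return environ.get("MUSICBRAINZ_URL", DEFAULT_BASE_URL).rstrip("/")


print(resolve_base_url({}))  # http://localhost:5001
```

With a change like this, `self.base_url = resolve_base_url()` would replace the hard-coded assignment.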

### Verbose Output

For debugging, uncomment the debug print statements in the script to get more verbose output.

## Command Line Shortcuts

### Common Aliases

Add these to your shell profile for convenience:

```bash
# Add to ~/.bashrc or ~/.zshrc
# The relative path means these work only from the directory containing the script
alias mbclean='python musicbrainz_cleaner.py'
alias mbclean-help='python musicbrainz_cleaner.py --help'
```

### Usage with Aliases

```bash
# Using alias
mbclean songs.json

# Show help
mbclean-help
```

## Integration Examples

### With Git

```bash
# Process files and commit changes
python musicbrainz_cleaner.py songs.json
git add songs_cleaned.json
git commit -m "Clean song metadata with MusicBrainz IDs"
```

### With Cron Jobs

```bash
# Add to crontab to process files daily
0 2 * * * cd /path/to/musicbrainz-cleaner && python musicbrainz_cleaner.py /path/to/songs.json
```

### With Shell Scripts

```bash
#!/bin/bash
# clean_songs.sh
INPUT_FILE="$1"
OUTPUT_FILE="${INPUT_FILE%.json}_cleaned.json"

# Test the command directly instead of inspecting $? afterwards
if python musicbrainz_cleaner.py "$INPUT_FILE" "$OUTPUT_FILE"; then
    echo "Successfully cleaned $INPUT_FILE"
    echo "Output saved to $OUTPUT_FILE"
else
    echo "Error processing $INPUT_FILE" >&2
    exit 1
fi
```

## Command Reference Summary

| Command | Description |
|---------|-------------|
| `python musicbrainz_cleaner.py file.json` | Basic usage |
| `python musicbrainz_cleaner.py file.json output.json` | Custom output |
| `python musicbrainz_cleaner.py --help` | Show help |
| `python musicbrainz_cleaner.py --version` | Show version |