# MusicBrainz Data Cleaner - CLI Commands Reference

## Overview

The MusicBrainz Data Cleaner is a command-line interface (CLI) tool that processes JSON song data files and cleans and normalizes their metadata using the MusicBrainz database.

## Basic Command Structure

```bash
python musicbrainz_cleaner.py <input_file> [output_file] [options]
```

## Command Arguments

### Required Arguments

| Argument | Type | Description | Example |
|----------|------|-------------|---------|
| `input_file` | string | Path to the JSON file containing song data | `my_songs.json` |

### Optional Arguments

| Argument | Type | Description | Example |
|----------|------|-------------|---------|
| `output_file` | string | Path for the cleaned output file | `cleaned_songs.json` |
| `--help` | flag | Show help information | `--help` |
| `--version` | flag | Show version information | `--version` |

## Command Examples

### Basic Usage

```bash
# Clean songs and save to auto-generated filename
python musicbrainz_cleaner.py songs.json
# Output: songs_cleaned.json
```

### Custom Output File

```bash
# Specify custom output filename
python musicbrainz_cleaner.py songs.json cleaned_songs.json
```

### Help and Information

```bash
# Show help information
python musicbrainz_cleaner.py --help

# Show version information
python musicbrainz_cleaner.py --version
```

## Input File Format

The input file must be a valid JSON file containing an array of song objects:

```json
[
  {
    "artist": "ACDC",
    "title": "Shot In The Dark",
    "disabled": false,
    "favorite": true,
    "guid": "8946008c-7acc-d187-60e6-5286e55ad502",
    "path": "z://MP4\\ACDC - Shot In The Dark (Karaoke Version).mp4"
  }
]
```

### Required Fields

- `artist`: The artist name (string)
- `title`: The song title (string)

### Optional Fields

Any additional fields are preserved in the output:

- `disabled`: Boolean flag
- `favorite`: Boolean flag
- `guid`: Unique identifier
- `path`: File path
- Any other custom fields

## Output File Format

The output file has the same structure as the input, with cleaned data and added MBID fields:

```json
[
  {
    "artist": "AC/DC",
    "title": "Shot in the Dark",
    "disabled": false,
    "favorite": true,
    "guid": "8946008c-7acc-d187-60e6-5286e55ad502",
    "path": "z://MP4\\ACDC - Shot In The Dark (Karaoke Version).mp4",
    "mbid": "66c662b6-6e2f-4930-8610-912e24c63ed1",
    "recording_mbid": "cf8b5cd0-d97c-413d-882f-fc422a2e57db"
  }
]
```

### Added Fields

- `mbid`: MusicBrainz Artist ID (string)
- `recording_mbid`: MusicBrainz Recording ID (string)

## Command Line Options

### Help Option

```bash
python musicbrainz_cleaner.py --help
```

**Output:**

```
Usage: python musicbrainz_cleaner.py <input_file.json> [output_file.json]

MusicBrainz Data Cleaner - Clean and normalize song data using MusicBrainz

Arguments:
  input_file.json     JSON file containing array of song objects
  output_file.json    Optional: Output file for cleaned data (default: input_file_cleaned.json)

Examples:
  python musicbrainz_cleaner.py songs.json
  python musicbrainz_cleaner.py songs.json cleaned_songs.json

Requirements:
  - MusicBrainz server running on http://localhost:5001
  - Python 3.6+ with requests library
```

### Version Option

```bash
python musicbrainz_cleaner.py --version
```

**Output:**

```
MusicBrainz Data Cleaner v1.0.0
```

## Error Messages and Exit Codes

### Exit Codes

| Code | Meaning | Description |
|------|---------|-------------|
| 0 | Success | Processing completed successfully |
| 1 | Error | General error occurred |
| 2 | Usage Error | Invalid command line arguments |

### Common Error Messages

#### File Not Found

```
Error: File 'songs.json' not found
```

#### Invalid JSON

```
Error: Invalid JSON in file 'songs.json'
```

#### Invalid Input Format

```
Error: Input file should contain a JSON array of songs
```

#### Connection Error

```
Error searching for artist 'Artist Name': Connection refused
```

#### Missing Dependencies

```
ModuleNotFoundError: No module named 'requests'
```

## Processing Output

### Progress Indicators

```
Processing 3 songs...
==================================================

[1/3] Processing: ACDC - Shot In The Dark
  ✅ Found artist: AC/DC (MBID: 66c662b6-6e2f-4930-8610-912e24c63ed1)
  ✅ Found recording: Shot in the Dark (MBID: cf8b5cd0-d97c-413d-882f-fc422a2e57db)
  ✅ Updated to: AC/DC - Shot in the Dark

[2/3] Processing: Bruno Mars ft. Cardi B - Finesse Remix
  ❌ Could not find artist: Bruno Mars ft. Cardi B

[3/3] Processing: Taylor Swift - Love Story
  ✅ Found artist: Taylor Swift (MBID: 20244d07-534f-4eff-b4d4-930878889970)
  ✅ Found recording: Love Story (MBID: d783e6c5-761f-4fc3-bfcf-6089cdfc8f96)
  ✅ Updated to: Taylor Swift - Love Story

==================================================
✅ Processing complete!
📁 Output saved to: songs_cleaned.json
```

### Status Indicators

| Symbol | Meaning | Description |
|--------|---------|-------------|
| ✅ | Success | Operation completed successfully |
| ❌ | Error | Operation failed |
| 🔄 | Processing | Currently processing |

## Batch Processing

### Multiple Files

To process multiple files, use a shell loop:

```bash
# Process all JSON files in current directory
for file in *.json; do
    # Skip output from earlier runs (default output name is <input>_cleaned.json)
    case "$file" in *_cleaned.json) continue ;; esac
    python musicbrainz_cleaner.py "$file"
done
```

### Large Files

For large files, the tool processes songs one at a time with a 0.1-second delay between API calls to avoid overloading the MusicBrainz server.

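
The throttling described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: `process_songs`, `clean_song`, and `API_DELAY_SECONDS` are hypothetical names, and only the 0.1-second delay comes from this document.

```python
import time
from typing import Callable, Dict, List

API_DELAY_SECONDS = 0.1  # delay between API calls, as documented above


def process_songs(
    songs: List[Dict],
    clean_song: Callable[[Dict], Dict],
    delay: float = API_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Clean songs one at a time, pausing between consecutive lookups.

    `clean_song` stands in for whatever function performs the MusicBrainz
    lookup (hypothetical). `sleep` is injectable so the loop can be tested
    without actually waiting.
    """
    cleaned = []
    for index, song in enumerate(songs):
        if index > 0:
            sleep(delay)  # pause before every lookup except the first
        cleaned.append(clean_song(song))
    return cleaned


if __name__ == "__main__":
    demo = [{"artist": "ACDC", "title": "Shot In The Dark"}]
    # The lambda copies the song unchanged; a real lookup would go here.
    print(process_songs(demo, clean_song=lambda song: dict(song)))
```

Assuming one pause per song, 10,000 songs add roughly 1,000 seconds (about 17 minutes) of waiting on top of the lookup time itself.
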
## Environment Variables

The tool uses the following default configuration. To change the server URL, edit the script as described under Custom MusicBrainz Server.

| Setting | Default | Description |
|---------|---------|-------------|
| MusicBrainz URL | `http://localhost:5001` | Local MusicBrainz server URL |
| API Delay | `0.1` seconds | Delay between API calls |

## Troubleshooting Commands

### Check MusicBrainz Server Status

```bash
# Test if server is running
curl -I http://localhost:5001

# Test API endpoint (quote the URL so the shell does not interpret "&")
curl "http://localhost:5001/ws/2/artist/?query=name:AC/DC&fmt=json"
```

### Validate JSON File

```bash
# Check if JSON is valid
python -m json.tool songs.json

# Check JSON structure (fails if the top-level value is not an array)
python -c "import json; data=json.load(open('songs.json')); assert isinstance(data, list), 'not a JSON array'; print('Valid JSON array with', len(data), 'items')"
```

### Check Python Dependencies

```bash
# Check if requests is installed
python -c "import requests; print('requests version:', requests.__version__)"

# Install if missing
pip install requests
```

## Advanced Usage

### Custom MusicBrainz Server

To use a different MusicBrainz server, modify the script:

```python
# In musicbrainz_cleaner.py, change:
self.base_url = "http://your-server:5001"
```

### Verbose Output

For debugging, you can get more verbose output by uncommenting the debug print statements in the script.

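
One way to make verbose output switchable, rather than commenting and uncommenting print statements, is Python's standard `logging` module. The sketch below is an assumption about how the script could be adapted, not a description of its current code: the logger name and `configure_logging` are hypothetical.

```python
import logging

# Hypothetical logger name; the actual script may not define a logger.
logger = logging.getLogger("musicbrainz_cleaner")


def configure_logging(verbose: bool = False) -> None:
    """Send log messages to stderr; include DEBUG messages when verbose."""
    handler = logging.StreamHandler()  # writes to stderr by default
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    # Replace existing handlers so repeated calls do not duplicate output.
    logger.handlers = [handler]
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    logger.propagate = False


if __name__ == "__main__":
    configure_logging(verbose=True)
    # A debug print statement in the script would become a call like this:
    logger.debug("Searching for artist %r", "ACDC")
```

With this approach, debug messages stay in the code and are shown only when `configure_logging(verbose=True)` is called.
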
## Command Line Shortcuts

### Common Aliases

Add these to your shell profile for convenience:

```bash
# Add to ~/.bashrc or ~/.zshrc
# Note: the relative script path only resolves when you run the alias from
# the directory containing musicbrainz_cleaner.py; use an absolute path otherwise.
alias mbclean='python musicbrainz_cleaner.py'
alias mbclean-help='python musicbrainz_cleaner.py --help'
```

### Usage with Aliases

```bash
# Using alias
mbclean songs.json

# Show help
mbclean-help
```

## Integration Examples

### With Git

```bash
# Process files and commit changes
python musicbrainz_cleaner.py songs.json
git add songs_cleaned.json
git commit -m "Clean song metadata with MusicBrainz IDs"
```

### With Cron Jobs

```bash
# Add to crontab to process files daily at 02:00
0 2 * * * cd /path/to/musicbrainz-cleaner && python musicbrainz_cleaner.py /path/to/songs.json
```

### With Shell Scripts

```bash
#!/bin/bash
# clean_songs.sh

INPUT_FILE="$1"
OUTPUT_FILE="${INPUT_FILE%.json}_cleaned.json"

# Run the cleaner and branch on its exit status
if python musicbrainz_cleaner.py "$INPUT_FILE" "$OUTPUT_FILE"; then
    echo "Successfully cleaned $INPUT_FILE"
    echo "Output saved to $OUTPUT_FILE"
else
    echo "Error processing $INPUT_FILE" >&2
    exit 1
fi
```

## Command Reference Summary

| Command | Description |
|---------|-------------|
| `python musicbrainz_cleaner.py file.json` | Basic usage |
| `python musicbrainz_cleaner.py file.json output.json` | Custom output |
| `python musicbrainz_cleaner.py --help` | Show help |
| `python musicbrainz_cleaner.py --version` | Show version |