MusicBrainz Cleaner Setup Guide

This guide walks through setting up the MusicBrainz database and the Docker services that the cleaner depends on.

Prerequisites

  • Docker Desktop installed and running
  • At least 8GB of available RAM
  • At least 10GB of free disk space
  • Git (to clone the repositories)

Step 1: Clone the MusicBrainz Server Repository

# Clone the main MusicBrainz server repository (if not already done)
git clone https://github.com/metabrainz/musicbrainz-server.git
cd musicbrainz-server

Step 2: Start the MusicBrainz Docker Services

The MusicBrainz server uses Docker Compose to run multiple services including PostgreSQL, Solr search, Redis, and the web server.

# Navigate to the musicbrainz-docker directory (this guide assumes it sits inside musicbrainz-server, next to musicbrainz-cleaner)
cd musicbrainz-docker

# Check if port 5000 is available (common conflict on macOS)
lsof -i :5000

# If port 5000 is in use, use port 5001 instead
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d

# Or if port 5000 is free, use the default
docker-compose up -d
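
The port check above can also be scripted. The following is a minimal sketch, assuming lsof is installed; the helper names port_in_use and pick_port are hypothetical and are not part of the cleaner or of musicbrainz-docker.

```shell
# Returns 0 if some process is listening on the given TCP port.
port_in_use() {
  lsof -iTCP:"$1" -sTCP:LISTEN -t >/dev/null 2>&1
}

# Prints the first candidate port that is free; returns 1 if none is free.
pick_port() {
  local port
  for port in "$@"; do
    if ! port_in_use "$port"; then
      echo "$port"
      return 0
    fi
  done
  return 1
}

# Usage (hypothetical wrapper around the commands above):
#   port=$(pick_port 5000 5001) && MUSICBRAINZ_WEB_SERVER_PORT="$port" docker-compose up -d
```

Whichever port is picked here should also be the one written to the .env file in Step 5.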

Troubleshooting Port Conflicts

If you get a port conflict error:

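# See which process is holding port 5000 before killing anything
lsof -i :5000

# Kill that process (on macOS the AirPlay Receiver often holds port 5000 and may restart after being killed)
lsof -ti:5000 | xargs kill -9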

# Or use a different port
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d

Troubleshooting Container Conflicts

If you get container name conflicts:

# Remove existing containers
docker-compose down

# Force remove conflicting containers
docker rm -f musicbrainz-docker-db-1

# Start fresh
docker-compose up -d

Step 3: Wait for Services to Start

The services take time to initialize, especially the database:

# Check service status
docker-compose ps

# Follow the database logs while the services initialize (this can take 5-10 minutes; press Ctrl+C to stop following)
docker-compose logs -f db

Important: Do not continue until the db logs show that database initialization has completed; the later steps fail if the database is still starting.
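
Instead of watching the logs by hand, the wait can be expressed as a retry loop. This is a sketch only: wait_for is a hypothetical helper, and the probe in the usage comment reuses the psql check from the Troubleshooting section, which confirms that the database accepts connections but not that every other service has finished starting.

```shell
# wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS is exhausted.
wait_for() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage, run from the musicbrainz-docker directory (up to 60 tries, 10 s apart):
#   wait_for 60 10 docker-compose exec -T db \
#     psql -U musicbrainz -d musicbrainz_db -c "SELECT 1;" && echo "db is ready"
```

The -T flag disables pseudo-TTY allocation so the probe also works when it is run from a script.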

Step 4: Verify Services Are Running

# Check all containers are running
docker-compose ps

# Test the web interface (if using port 5001)
curl http://localhost:5001

# Or if using default port 5000
curl http://localhost:5000
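
A plain curl prints the page body whether or not the request succeeded, so a scripted check is easier to read. The sketch below assumes the port is taken from MUSICBRAINZ_WEB_SERVER_PORT and falls back to 5000; web_url and check_web are hypothetical helper names.

```shell
# Builds the web server URL from MUSICBRAINZ_WEB_SERVER_PORT (default 5000).
web_url() {
  echo "http://localhost:${MUSICBRAINZ_WEB_SERVER_PORT:-5000}"
}

# Succeeds only if the server answers without an HTTP error status.
# -f: fail on HTTP 4xx/5xx, -sS: quiet but still report errors,
# -o /dev/null: discard the response body.
check_web() {
  curl -fsS -o /dev/null "$(web_url)"
}

# Usage:
#   check_web && echo "web server is up at $(web_url)"
```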

Step 5: Set Environment Variables

Create a .env file in the musicbrainz-cleaner directory:

cd ../musicbrainz-cleaner

# Create .env file
cat > .env << EOF
# Database connection (default Docker setup; DB_HOST is the db container's IP and may differ on your machine)
DB_HOST=172.18.0.2
DB_PORT=5432
DB_NAME=musicbrainz_db
DB_USER=musicbrainz
DB_PASSWORD=musicbrainz

# MusicBrainz web server
MUSICBRAINZ_WEB_SERVER_PORT=5001
EOF

Note: If you used the default port 5000, change MUSICBRAINZ_WEB_SERVER_PORT=5001 to MUSICBRAINZ_WEB_SERVER_PORT=5000.
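
Before moving on, it can help to confirm that the .env file is complete and that DB_HOST matches the address Docker actually assigned. The following is a sketch under stated assumptions: missing_env_keys and container_ip are hypothetical helpers, and the default container name is the one used in the container-conflict example above, which may differ on your system.

```shell
# Prints each required key that has no non-empty KEY=value line in the given file.
missing_env_keys() {
  local file=$1 key
  for key in DB_HOST DB_PORT DB_NAME DB_USER DB_PASSWORD MUSICBRAINZ_WEB_SERVER_PORT; do
    grep -q "^${key}=." "$file" || echo "$key"
  done
}

# Prints the IP address Docker assigned to a container.
# The default name is an assumption; pass your own container name if it differs.
container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
    "${1:-musicbrainz-docker-db-1}"
}

# Usage:
#   missing_env_keys .env   # prints nothing when the file is complete
#   container_ip            # compare the result with DB_HOST
```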

Step 6: Test the Connection

# Run a simple test to verify everything is working
docker-compose run --rm musicbrainz-cleaner python3 quick_test_20.py

Service Details

The Docker Compose setup includes:

  • PostgreSQL Database (db): Main MusicBrainz database
  • Solr Search (search): Full-text search engine
  • Redis (redis): Caching and session storage
  • Message Queue (mq): Background job processing
  • MusicBrainz Web Server (musicbrainz): Main web application
  • Indexer (indexer): Search index maintenance
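
To check that every service in the list above is actually running, the output of docker-compose ps can be compared against the expected names. This is a minimal sketch; missing_services is a hypothetical helper, and the service names are the ones given in the list.

```shell
# Reads the names of running services from stdin (one per line) and prints
# every expected service that is not among them.
missing_services() {
  local running svc
  running=$(cat)
  for svc in db search redis mq musicbrainz indexer; do
    printf '%s\n' "$running" | grep -qx "$svc" || echo "$svc"
  done
}

# Usage, from the musicbrainz-docker directory:
#   docker-compose ps --services --filter "status=running" | missing_services
```

No output means all six services are up; any name printed is a service to inspect with docker-compose logs.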

Ports Used

  • 5000/5001: MusicBrainz web server (configurable)
  • 5432: PostgreSQL database (internal)
  • 8983: Solr search (internal)
  • 6379: Redis (internal)
  • 5672: Message queue (internal)

Stopping Services

# Stop all services
cd musicbrainz-docker
docker-compose down

# To also remove volumes (WARNING: this deletes all data)
docker-compose down -v
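
Because removing volumes cannot be undone, a confirmation prompt in front of the command is a cheap safeguard. The confirm helper below is a hypothetical sketch, not something shipped with the cleaner.

```shell
# Asks a yes/no question on stderr and succeeds only on an explicit "y" or "yes".
confirm() {
  local answer
  printf '%s [y/N] ' "$1" >&2
  read -r answer || return 1
  case $answer in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   confirm "Delete all MusicBrainz volumes?" && docker-compose down -v
```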

Restarting Services

# Restart all services
docker-compose restart

# Or restart a specific service
docker-compose restart db

Monitoring Services

# View logs for all services
docker-compose logs -f

# View logs for a specific service
docker-compose logs -f db
docker-compose logs -f musicbrainz

# Check resource usage
docker stats

Troubleshooting

Database Connection Issues

# Check if database is running
docker-compose ps db

# Check database logs
docker-compose logs db

# Test database connection
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT 1;"

Memory Issues

If you encounter memory issues:

# Increase Docker memory limit in Docker Desktop settings
# Recommended: 8GB minimum, 16GB preferred

# Check current memory usage
docker stats
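
The memory available to Docker can also be checked from the command line. The sketch below assumes docker info exposes the total in bytes through its MemTotal field; has_min_memory is a hypothetical helper. The reported value may be slightly below the limit configured in Docker Desktop, so a threshold a little under the target may be more practical.

```shell
# has_min_memory BYTES MIN_GIB
# Succeeds if BYTES is at least MIN_GIB gibibytes.
has_min_memory() {
  [ "$1" -ge $(( $2 * 1024 * 1024 * 1024 )) ]
}

# Usage:
#   has_min_memory "$(docker info --format '{{.MemTotal}}')" 8 \
#     || echo "Docker has less than 8GB; raise the limit in Docker Desktop settings"
```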

Platform Issues (Apple Silicon)

If you're on Apple Silicon (M1/M2) and see platform warnings:

# The services will still work, but you may see warnings about platform mismatch
# This is normal and doesn't affect functionality

Performance Tips

  1. Allocate sufficient memory to Docker Desktop (8GB+ recommended)
  2. Use SSD storage for better database performance
  3. Close other resource-intensive applications while running the services
  4. Wait for full initialization before running tests

Next Steps

Once the services are running successfully:

  1. Run the quick test: python3 quick_test_20.py
  2. Run larger tests: python3 bulk_test_1000.py
  3. Use the cleaner on your own data: python3 -m src.cli.main --input your_file.json --output cleaned.json

After System Reboot

After restarting your Mac, you'll need to restart the MusicBrainz services:

# Navigate to the musicbrainz-cleaner directory (replace this path with the location of your own checkout)
cd /Users/mattbruce/Documents/Projects/musicbrainz-server/musicbrainz-cleaner

# If Docker Desktop is already running
./restart_services.sh

# Or manually
cd ../musicbrainz-docker && MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d

Full Restart (If you have issues)

# Complete setup including Docker checks
./start_services.sh

Auto-start Setup (Optional)

  1. Enable Docker Desktop auto-start:

    • Open Docker Desktop
    • Go to Settings → General
    • Check "Start Docker Desktop when you log in"
  2. Then run ./restart_services.sh after each reboot

Note: Your data is preserved in Docker volumes, so you don't need to reconfigure anything after a reboot.

Support

If you encounter issues:

  1. Check the logs: docker-compose logs -f
  2. Verify Docker has sufficient resources
  3. Ensure all prerequisites are met
  4. Try restarting the services: docker-compose restart