# MusicBrainz Cleaner Setup Guide

This guide will help you set up the MusicBrainz database and Docker services needed to run the cleaner.

## Prerequisites

- Docker Desktop installed and running
- At least 8GB of available RAM
- At least 10GB of free disk space
- Git (to clone the repositories)

## Step 1: Clone the MusicBrainz Server Repository

```bash
# Clone the main MusicBrainz server repository (if not already done)
git clone https://github.com/metabrainz/musicbrainz-server.git
cd musicbrainz-server
```

## Step 2: Start the MusicBrainz Docker Services

The MusicBrainz server uses Docker Compose to run multiple services including PostgreSQL, Solr search, Redis, and the web server.

```bash
# Navigate to the musicbrainz-docker directory
cd musicbrainz-docker

# Check if port 5000 is available (common conflict on macOS)
lsof -i :5000

# If port 5000 is in use, use port 5001 instead
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d

# Or if port 5000 is free, use the default
docker-compose up -d
```

### Troubleshooting Port Conflicts

If you get a port conflict error:

```bash
# Kill any process using port 5000
lsof -ti:5000 | xargs kill -9

# Or use a different port
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d
```

### Troubleshooting Container Conflicts

If you get container name conflicts:

```bash
# Remove existing containers
docker-compose down

# Force remove conflicting containers
docker rm -f musicbrainz-docker-db-1

# Start fresh
docker-compose up -d
```

## Step 3: Wait for Services to Start

The services take time to initialize, especially the database:

```bash
# Check service status
docker-compose ps

# Wait for all services to be healthy (this can take 5-10 minutes)
docker-compose logs -f db
```

**Important**: Wait until you see database initialization complete messages before proceeding.

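Watching the logs by hand can be replaced with a polling loop. The following is a minimal sketch, not part of the cleaner repository: `wait_until` is a hypothetical helper name, and the suggested readiness probe assumes that `pg_isready` (a standard PostgreSQL client tool) is available inside the `db` container.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempt budget is exhausted.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
# (wait_until is a hypothetical helper, not shipped with the cleaner.)
wait_until() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  # Run the probe quietly; only its exit status matters.
  while ! "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "Ready after $attempt attempt(s): $*"
}
```

From the `musicbrainz-docker` directory, a call such as `wait_until 60 10 docker-compose exec -T db pg_isready -U musicbrainz -d musicbrainz_db` would poll every 10 seconds for up to about 10 minutes. Note that `pg_isready` only reports that PostgreSQL accepts connections; it does not confirm that initialization has finished, so the log check above still applies.
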
## Step 4: Verify Services Are Running

```bash
# Check all containers are running
docker-compose ps

# Test the web interface (if using port 5001)
curl http://localhost:5001

# Or if using default port 5000
curl http://localhost:5000
```

## Step 5: Set Environment Variables

Create a `.env` file in the `musicbrainz-cleaner` directory:

```bash
cd ../musicbrainz-cleaner

# Create .env file
cat > .env << EOF
# Database connection (default Docker setup)
DB_HOST=172.18.0.2
DB_PORT=5432
DB_NAME=musicbrainz_db
DB_USER=musicbrainz
DB_PASSWORD=musicbrainz

# MusicBrainz web server
MUSICBRAINZ_WEB_SERVER_PORT=5001
EOF
```

**Note**: If you used the default port 5000, change `MUSICBRAINZ_WEB_SERVER_PORT=5001` to `MUSICBRAINZ_WEB_SERVER_PORT=5000`.

## Step 6: Test the Connection

```bash
# Run a simple test to verify everything is working
docker-compose run --rm musicbrainz-cleaner python3 quick_test_20.py
```

## Service Details

The Docker Compose setup includes:

- **PostgreSQL Database** (`db`): Main MusicBrainz database
- **Solr Search** (`search`): Full-text search engine
- **Redis** (`redis`): Caching and session storage
- **Message Queue** (`mq`): Background job processing
- **MusicBrainz Web Server** (`musicbrainz`): Main web application
- **Indexer** (`indexer`): Search index maintenance

## Ports Used

- **5000/5001**: MusicBrainz web server (configurable)
- **5432**: PostgreSQL database (internal)
- **8983**: Solr search (internal)
- **6379**: Redis (internal)
- **5672**: Message queue (internal)

## Stopping Services

```bash
# Stop all services
cd musicbrainz-docker
docker-compose down

# To also remove volumes (WARNING: this deletes all data)
docker-compose down -v
```

## Restarting Services

```bash
# Restart all services
docker-compose restart

# Or restart specific service
docker-compose restart db
```

## Monitoring Services

```bash
# View logs for all services
docker-compose logs -f

# View logs for specific service
docker-compose logs -f db
docker-compose logs -f musicbrainz

# Check resource usage
docker stats
```

## Troubleshooting

### Database Connection Issues

```bash
# Check if database is running
docker-compose ps db

# Check database logs
docker-compose logs db

# Test database connection
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT 1;"
```

### Memory Issues

If you encounter memory issues:

```bash
# Increase Docker memory limit in Docker Desktop settings
# Recommended: 8GB minimum, 16GB preferred

# Check current memory usage
docker stats
```

### Platform Issues (Apple Silicon)

If you're on Apple Silicon (M1/M2) and see platform warnings:

```bash
# The services will still work, but you may see warnings about platform mismatch
# This is normal and doesn't affect functionality
```

## Performance Tips

1. **Allocate sufficient memory** to Docker Desktop (8GB+ recommended)
2. **Use SSD storage** for better database performance
3. **Close other resource-intensive applications** while running the services
4. **Wait for full initialization** before running tests

## Next Steps

Once the services are running successfully:

1. Run the quick test: `python3 quick_test_20.py`
2. Run larger tests: `python3 bulk_test_1000.py`
3. Use the cleaner on your own data: `python3 -m src.cli.main --input your_file.json --output cleaned.json`

## 🔄 After System Reboot

After restarting your Mac, you'll need to restart the MusicBrainz services:

### Quick Restart (Recommended)

```bash
# Navigate to musicbrainz-cleaner directory
cd /Users/mattbruce/Documents/Projects/musicbrainz-server/musicbrainz-cleaner

# If Docker Desktop is already running
./restart_services.sh

# Or manually
cd ../musicbrainz-docker && MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d
```

### Full Restart (If you have issues)

```bash
# Complete setup including Docker checks
./start_services.sh
```

### Auto-start Setup (Optional)

1. **Enable Docker Desktop auto-start**:
   - Open Docker Desktop
   - Go to Settings → General
   - Check "Start Docker Desktop when you log in"
2. **Then just run**: `./restart_services.sh` after each reboot

**Note**: Your data is preserved in Docker volumes, so you don't need to reconfigure anything after a reboot.

## Support

If you encounter issues:

1. Check the logs: `docker-compose logs -f`
2. Verify Docker has sufficient resources
3. Ensure all prerequisites are met
4. Try restarting the services: `docker-compose restart`
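
Part of this checklist can be scripted. The following is a minimal sketch under stated assumptions, not a script shipped with the cleaner: `run_check` and `support_checklist` are hypothetical helper names, the commands must be run from the `musicbrainz-docker` directory, and the database name and user are the defaults used earlier in this guide.

```shell
#!/usr/bin/env bash
# Run one check quietly, print its status, and return non-zero on failure.
# Usage: run_check <label> <command> [args...]
# (run_check and support_checklist are hypothetical helpers.)
run_check() {
  local label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "[ OK ] $label"
  else
    echo "[FAIL] $label"
    return 1
  fi
}

# Run every check and count failures instead of stopping at the first one.
# Assumes the current directory is musicbrainz-docker.
support_checklist() {
  local failures=0
  run_check "Docker daemon reachable" docker info \
    || failures=$((failures + 1))
  run_check "db service listed" docker-compose ps db \
    || failures=$((failures + 1))
  run_check "database answers SELECT 1" \
    docker-compose exec -T db psql -U musicbrainz -d musicbrainz_db -c "SELECT 1;" \
    || failures=$((failures + 1))
  echo "$failures check(s) failed"
  [ "$failures" -eq 0 ]
}
```

After sourcing the file, calling `support_checklist` prints one status line per check and returns a non-zero exit code if any check failed. The `-T` flag disables pseudo-TTY allocation so the `exec` call also works from non-interactive scripts.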