# MusicBrainz Cleaner Setup Guide
This guide walks through setting up the MusicBrainz database and the Docker services that the cleaner needs in order to run.
## Prerequisites
- Docker Desktop installed and running
- At least 8GB of available RAM
- At least 10GB of free disk space
- Git (to clone the repositories)
## Step 1: Clone the MusicBrainz Server Repository
```bash
# Clone the main MusicBrainz server repository (if not already done)
git clone https://github.com/metabrainz/musicbrainz-server.git
cd musicbrainz-server
```
## Step 2: Start the MusicBrainz Docker Services
The MusicBrainz server uses Docker Compose to run multiple services including PostgreSQL, Solr search, Redis, and the web server.
```bash
# Navigate to the musicbrainz-docker directory
cd musicbrainz-docker
# Check if port 5000 is available (common conflict on macOS)
lsof -i :5000
# If port 5000 is in use, use port 5001 instead
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d
# Or if port 5000 is free, use the default
docker-compose up -d
```
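The port check and fallback above can be folded into one helper. This is a minimal sketch, not something shipped with the repository: `pick_port` is a hypothetical function name, and it assumes `lsof` is installed (as it is on macOS and most Linux distributions).

```bash
# pick_port PORT...: print the first listed port that has no TCP listener.
# Hypothetical helper; relies on lsof exiting non-zero when nothing matches.
pick_port() {
  local port
  for port in "$@"; do
    if ! lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "$port"
      return 0
    fi
  done
  echo "none of these ports are free: $*" >&2
  return 1
}

# Usage (from the musicbrainz-docker directory):
#   MUSICBRAINZ_WEB_SERVER_PORT="$(pick_port 5000 5001)" docker-compose up -d
```

If the helper picks 5001, remember to use that port in the later steps as well.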
### Troubleshooting Port Conflicts
If you get a port conflict error:
```bash
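# See what is listening on port 5000 before killing anything.
# On recent macOS this is often the AirPlay Receiver, which restarts itself
# after being killed; in that case use a different port instead.
lsof -i :5000
# Kill any process using port 5000
lsof -ti:5000 | xargs kill -9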
# Or use a different port
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d
```
### Troubleshooting Container Conflicts
If you get container name conflicts:
```bash
# Remove existing containers
docker-compose down
# Force remove conflicting containers
docker rm -f musicbrainz-docker-db-1
# Start fresh
docker-compose up -d
```
## Step 3: Wait for Services to Start
The services take time to initialize, especially the database:
```bash
# Check service status
docker-compose ps
# Follow the database logs while the services initialize (this can take 5-10 minutes).
# Press Ctrl+C to stop following; the containers keep running.
docker-compose logs -f db
```
**Important**: Wait until the logs show that database initialization has completed before proceeding. PostgreSQL normally logs `database system is ready to accept connections` once it is up.
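Instead of watching the logs by hand, the wait can be scripted. The sketch below polls the web server until it answers or a timeout expires. `wait_for_http` is a hypothetical helper, and it assumes the web server only returns a successful response once initialization has finished, which may not hold for every failure mode.

```bash
# wait_for_http URL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls URL with curl until it returns a successful response or the timeout expires.
wait_for_http() {
  local url="$1" timeout="${2:-600}" interval="${3:-5}" waited=0
  # -f makes curl fail on HTTP errors (4xx/5xx), -sS hides progress but keeps errors.
  until curl -fsS -o /dev/null "$url" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$url is responding"
}

# Usage (port 5001 as in the example above; use 5000 for the default):
#   wait_for_http http://localhost:5001 600
```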
## Step 4: Verify Services Are Running
```bash
# Check all containers are running
docker-compose ps
# Test the web interface (if using port 5001)
curl http://localhost:5001
# Or if using default port 5000
curl http://localhost:5000
```
## Step 5: Set Environment Variables
Create a `.env` file in the `musicbrainz-cleaner` directory:
```bash
cd ../musicbrainz-cleaner
# Create .env file
cat > .env << EOF
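# Database connection (default Docker setup)
# DB_HOST is the db container's IP on the Docker network. Docker assigns it
# dynamically, so confirm the value on your machine.
DB_HOST=172.18.0.2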
DB_PORT=5432
DB_NAME=musicbrainz_db
DB_USER=musicbrainz
DB_PASSWORD=musicbrainz
# MusicBrainz web server
MUSICBRAINZ_WEB_SERVER_PORT=5001
EOF
```
**Note**: If you used the default port 5000, change `MUSICBRAINZ_WEB_SERVER_PORT=5001` to `MUSICBRAINZ_WEB_SERVER_PORT=5000`.
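Because the `DB_HOST` address is assigned by Docker, it can be looked up rather than typed in. The following is a sketch under assumptions: the default container name `musicbrainz-docker-db-1` is taken from the troubleshooting commands in this guide and may differ on your machine, and both function names are hypothetical.

```bash
# db_container_ip [CONTAINER]: print the container's IP address on its Docker network(s).
# The default container name is an assumption; check `docker ps` for the real one.
db_container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
    "${1:-musicbrainz-docker-db-1}"
}

# set_env_value KEY VALUE [FILE]: replace (or add) KEY=VALUE in an env file.
set_env_value() {
  local key="$1" value="$2" file="${3:-.env}" tmp
  tmp=$(mktemp) || return 1
  # Copy every line except an existing KEY=... line, then append the new value.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Usage (from the musicbrainz-cleaner directory):
#   set_env_value DB_HOST "$(db_container_ip)" .env
```

The updated key is moved to the end of the file, which is harmless for a simple `KEY=VALUE` file like this one.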
## Step 6: Test the Connection
```bash
# Run a simple test to verify everything is working
docker-compose run --rm musicbrainz-cleaner python3 quick_test_20.py
```
## Service Details
The Docker Compose setup includes:
- **PostgreSQL Database** (`db`): Main MusicBrainz database
- **Solr Search** (`search`): Full-text search engine
- **Redis** (`redis`): Caching and session storage
- **Message Queue** (`mq`): Background job processing
- **MusicBrainz Web Server** (`musicbrainz`): Main web application
- **Indexer** (`indexer`): Search index maintenance
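To check that every service in this list is actually up, the expected names can be compared with what Compose reports as running. This is a sketch: `missing_services` is a hypothetical helper, and the service names are copied from the list above, so adjust them if your compose file differs.

```bash
# Service names as listed in this guide.
EXPECTED_SERVICES="db search redis mq musicbrainz indexer"

# missing_services RUNNING...: print each expected service that is not among the arguments.
missing_services() {
  local svc
  for svc in $EXPECTED_SERVICES; do
    case " $* " in
      *" $svc "*) ;;        # service is running
      *) echo "$svc" ;;     # service is missing
    esac
  done
}

# Usage (from the musicbrainz-docker directory):
#   missing_services $(docker-compose ps --services --filter status=running)
```

No output means all six services are running.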
## Ports Used
- **5000/5001**: MusicBrainz web server (configurable)
- **5432**: PostgreSQL database (internal)
- **8983**: Solr search (internal)
- **6379**: Redis (internal)
- **5672**: Message queue (internal)
## Stopping Services
```bash
# Stop all services
cd musicbrainz-docker
docker-compose down
# To also remove volumes (WARNING: this deletes all data)
docker-compose down -v
```
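Since `docker-compose down -v` is irreversible, it can be worth guarding it behind an explicit confirmation. A minimal sketch, with `confirm` as a hypothetical helper name:

```bash
# confirm MESSAGE: succeed only if the user types exactly "yes".
confirm() {
  local answer
  printf '%s Type "yes" to continue: ' "$1" >&2
  read -r answer
  [ "$answer" = "yes" ]
}

# Usage (from the musicbrainz-docker directory):
#   confirm "This deletes all MusicBrainz data." && docker-compose down -v
```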
## Restarting Services
```bash
# Restart all services
docker-compose restart
# Or restart a specific service
docker-compose restart db
```
## Monitoring Services
```bash
# View logs for all services
docker-compose logs -f
# View logs for a specific service
docker-compose logs -f db
docker-compose logs -f musicbrainz
# Check resource usage
docker stats
```
## Troubleshooting
### Database Connection Issues
```bash
# Check if database is running
docker-compose ps db
# Check database logs
docker-compose logs db
# Test database connection
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT 1;"
```
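For a more specific diagnosis than `SELECT 1`, PostgreSQL's `pg_isready` reports distinct exit codes (0 accepting, 1 rejecting, 2 no response, 3 no attempt made). The sketch below maps them to messages. `db_status` and the `COMPOSE` variable are hypothetical names, and it assumes `pg_isready` is present in the `db` container.

```bash
# Compose command to use; set COMPOSE="docker compose" for Compose v2.
COMPOSE=${COMPOSE:-docker-compose}

# db_status: describe the database state based on pg_isready's exit code.
db_status() {
  local rc=0
  # -T disables TTY allocation so this also works from scripts.
  $COMPOSE exec -T db pg_isready -U musicbrainz -d musicbrainz_db >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "accepting connections" ;;
    # Exit code 1 can also mean the exec itself failed, e.g. the container is not running.
    1) echo "rejecting connections (possibly still starting up)" ;;
    2) echo "no response from the server" ;;
    *) echo "check could not be run (exit code $rc)" ;;
  esac
}

# Usage (from the musicbrainz-docker directory):
#   db_status
```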
### Memory Issues
If you encounter memory issues:
```bash
# Increase Docker memory limit in Docker Desktop settings
# Recommended: 8GB minimum, 16GB preferred
# Check current memory usage
docker stats
```
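The memory available to Docker can also be read from the command line instead of the settings window. This is a sketch with hypothetical helper names. It rounds to the nearest GiB because the Docker VM can report slightly less than the configured value.

```bash
# docker_mem_gib: print Docker's total memory, rounded to the nearest GiB.
docker_mem_gib() {
  local bytes
  bytes=$(docker info --format '{{.MemTotal}}') || return 1
  # Add half a GiB before the integer division so the result is rounded, not truncated.
  echo $(( (bytes + 536870912) / 1073741824 ))
}

# docker_mem_ok [MIN_GIB]: succeed if Docker has at least MIN_GIB (default 8).
docker_mem_ok() {
  local gib
  gib=$(docker_mem_gib) || return 1
  [ "$gib" -ge "${1:-8}" ]
}

# Usage:
#   docker_mem_ok 8 || echo "Increase the memory limit in Docker Desktop settings"
```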
### Platform Issues (Apple Silicon)
If you're on Apple Silicon (M1/M2), you may see warnings about a platform mismatch. The services will still work; the warnings are normal and don't affect functionality.
## Performance Tips
1. **Allocate sufficient memory** to Docker Desktop (8GB+ recommended)
2. **Use SSD storage** for better database performance
3. **Close other resource-intensive applications** while running the services
4. **Wait for full initialization** before running tests
## Next Steps
Once the services are running successfully:
1. Run the quick test: `python3 quick_test_20.py`
2. Run larger tests: `python3 bulk_test_1000.py`
3. Use the cleaner on your own data: `python3 -m src.cli.main --input your_file.json --output cleaned.json`
## After System Reboot
After restarting your Mac, you'll need to restart the MusicBrainz services:
### Quick Restart (Recommended)
```bash
# Navigate to the musicbrainz-cleaner directory (adjust the path to your own checkout)
cd /Users/mattbruce/Documents/Projects/musicbrainz-server/musicbrainz-cleaner
# If Docker Desktop is already running
./restart_services.sh
# Or manually
cd ../musicbrainz-docker && MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d
```
### Full Restart (If you have issues)
```bash
# Complete setup including Docker checks
./start_services.sh
```
### Auto-start Setup (Optional)
1. **Enable Docker Desktop auto-start**:
   - Open Docker Desktop
   - Go to Settings → General
   - Check "Start Docker Desktop when you log in"
2. **Then just run**: `./restart_services.sh` after each reboot
**Note**: Your data is preserved in Docker volumes, so you don't need to reconfigure anything after a reboot.
## Support
If you encounter issues:
1. Check the logs: `docker-compose logs -f`
2. Verify Docker has sufficient resources
3. Ensure all prerequisites are met
4. Try restarting the services: `docker-compose restart`