diff --git a/PRD.md b/PRD.md index 64734d2..d483a87 100644 --- a/PRD.md +++ b/PRD.md @@ -374,6 +374,33 @@ python musicbrainz_cleaner.py --test-connection - **NEW**: Review and update band name protection list in `data/known_artists.json` - **NEW**: Monitor collaboration detection accuracy +### Operational Procedures + +#### After System Reboot +1. **Start Docker Desktop** (if auto-start not enabled) +2. **Restart MusicBrainz services**: + ```bash + cd musicbrainz-cleaner + ./restart_services.sh + ``` +3. **Wait for database initialization** (5-10 minutes) +4. **Test connection**: + ```bash + docker-compose run --rm musicbrainz-cleaner python3 quick_test_20.py + ``` + +#### Service Management +- **Start services**: `./start_services.sh` (full setup) or `./restart_services.sh` (quick restart) +- **Stop services**: `cd ../musicbrainz-docker && docker-compose down` +- **Check status**: `cd ../musicbrainz-docker && docker-compose ps` +- **View logs**: `cd ../musicbrainz-docker && docker-compose logs -f` + +#### Troubleshooting +- **Port conflicts**: Use `MUSICBRAINZ_WEB_SERVER_PORT=5001` environment variable +- **Container conflicts**: Run `docker-compose down` then restart +- **Database issues**: Check logs with `docker-compose logs -f db` +- **Memory issues**: Increase Docker Desktop memory allocation (8GB+ recommended) + ### Support - GitHub issues for bug reports - Documentation updates @@ -405,4 +432,11 @@ python musicbrainz_cleaner.py --test-connection - **Remove static caches** for better accuracy - **Database-first approach** ensures live data - **Fuzzy search thresholds** need tuning for different datasets -- **Connection pooling** would improve performance for large datasets \ No newline at end of file +- **Connection pooling** would improve performance for large datasets + +### Operational Insights +- **Docker Service Management**: MusicBrainz services require proper startup sequence and initialization time +- **Port Conflicts**: Common on macOS, requiring 
automatic detection and resolution +- **System Reboots**: Services need to be restarted after system reboots, but data persists in Docker volumes +- **Resource Requirements**: MusicBrainz services require significant memory (8GB+ recommended) and disk space +- **Platform Compatibility**: Apple Silicon (M1/M2) works but may show platform mismatch warnings \ No newline at end of file diff --git a/README.md b/README.md index 7d9b361..f3fe260 100644 --- a/README.md +++ b/README.md @@ -39,50 +39,81 @@ A powerful command-line tool that cleans and normalizes your song data using the ## 🚀 Quick Start -### 1. Install Dependencies +### Option 1: Automated Setup (Recommended) + +1. **Start MusicBrainz services**: + ```bash + ./start_services.sh + ``` + This script will: + - Check for Docker and port conflicts + - Start all MusicBrainz services + - Wait for database initialization + - Create environment configuration + - Test the connection + +2. **Run the cleaner**: + ```bash + docker-compose run --rm musicbrainz-cleaner python3 -m src.cli.main --input data/songs.json --output cleaned_songs.json + ``` + +### Option 2: Manual Setup + +1. **Start MusicBrainz services manually**: + ```bash + cd ../musicbrainz-docker + MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d + ``` + Wait 5-10 minutes for database initialization. + +2. **Create environment configuration**: + ```bash + # Create .env file in musicbrainz-cleaner directory + cat > .env << EOF + DB_HOST=172.18.0.2 + DB_PORT=5432 + DB_NAME=musicbrainz_db + DB_USER=musicbrainz + DB_PASSWORD=musicbrainz + MUSICBRAINZ_WEB_SERVER_PORT=5001 + EOF + ``` + +3. 
**Run the cleaner**: + ```bash + docker-compose run --rm musicbrainz-cleaner python3 -m src.cli.main --input data/songs.json --output cleaned_songs.json + ``` + +### For detailed setup instructions, see [SETUP.md](SETUP.md) + +## 🔄 After System Reboot + +After restarting your Mac, you'll need to restart the MusicBrainz services: + +### Quick Restart (Recommended) ```bash -pip install requests psycopg2-binary fuzzywuzzy python-Levenshtein +# If Docker Desktop is already running +./restart_services.sh + +# Or manually +cd ../musicbrainz-docker && MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d ``` -### 2. Set Up MusicBrainz Server - -#### Option A: Docker (Recommended) +### Full Restart (If you have issues) ```bash -# Clone MusicBrainz Docker repository -git clone https://github.com/metabrainz/musicbrainz-docker.git -cd musicbrainz-docker - -# Update postgres.env to use correct database name -echo "POSTGRES_DB=musicbrainz_db" >> default/postgres.env - -# Start the server -docker-compose up -d - -# Wait for database to be ready (can take 10-15 minutes) -docker-compose logs -f musicbrainz +# Complete setup including Docker checks +./start_services.sh ``` -#### Option B: Manual Setup -1. Install PostgreSQL 12+ -2. Create database: `createdb musicbrainz_db` -3. Import MusicBrainz data dump -4. Start MusicBrainz server on port 8080 +### Auto-start Setup (Optional) +1. **Enable Docker Desktop auto-start**: + - Open Docker Desktop + - Go to Settings → General + - Check "Start Docker Desktop when you log in" -### 3. Test Connection -```bash -python musicbrainz_cleaner.py --test-connection -``` +2. **Then just run**: `./restart_services.sh` after each reboot -### 4. Run the Cleaner -```bash -# Use database access (recommended, faster) -python musicbrainz_cleaner.py your_songs.json - -# Force API mode (slower, fallback) -python musicbrainz_cleaner.py your_songs.json --use-api -``` - -That's it! 
Your cleaned data will be saved to `your_songs_cleaned.json` +**Note**: Your data is preserved in Docker volumes, so you don't need to reconfigure anything after a reboot. ## 📋 Requirements diff --git a/SETUP.md b/SETUP.md new file mode 100644 index 0000000..1ea4e1d --- /dev/null +++ b/SETUP.md @@ -0,0 +1,266 @@ +# MusicBrainz Cleaner Setup Guide + +This guide will help you set up the MusicBrainz database and Docker services needed to run the cleaner. + +## Prerequisites + +- Docker Desktop installed and running +- At least 8GB of available RAM +- At least 10GB of free disk space +- Git (to clone the repositories) + +## Step 1: Clone the MusicBrainz Server Repository + +```bash +# Clone the main MusicBrainz server repository (if not already done) +git clone https://github.com/metabrainz/musicbrainz-server.git +cd musicbrainz-server +``` + +## Step 2: Start the MusicBrainz Docker Services + +The MusicBrainz server uses Docker Compose to run multiple services including PostgreSQL, Solr search, Redis, and the web server. 
+ +```bash +# Navigate to the musicbrainz-docker directory +cd musicbrainz-docker + +# Check if port 5000 is available (common conflict on macOS) +lsof -i :5000 + +# If port 5000 is in use, use port 5001 instead +MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d + +# Or if port 5000 is free, use the default +docker-compose up -d +``` + +### Troubleshooting Port Conflicts + +If you get a port conflict error: + +```bash +# Kill any process using port 5000 +lsof -ti:5000 | xargs kill -9 + +# Or use a different port +MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d +``` + +### Troubleshooting Container Conflicts + +If you get container name conflicts: + +```bash +# Remove existing containers +docker-compose down + +# Force remove conflicting containers +docker rm -f musicbrainz-docker-db-1 + +# Start fresh +docker-compose up -d +``` + +## Step 3: Wait for Services to Start + +The services take time to initialize, especially the database: + +```bash +# Check service status +docker-compose ps + +# Wait for all services to be healthy (this can take 5-10 minutes) +docker-compose logs -f db +``` + +**Important**: Wait until you see database initialization complete messages before proceeding. + +## Step 4: Verify Services Are Running + +```bash +# Check all containers are running +docker-compose ps + +# Test the web interface (if using port 5001) +curl http://localhost:5001 + +# Or if using default port 5000 +curl http://localhost:5000 +``` + +## Step 5: Set Environment Variables + +Create a `.env` file in the `musicbrainz-cleaner` directory: + +```bash +cd ../musicbrainz-cleaner + +# Create .env file +cat > .env << EOF +# Database connection (default Docker setup) +DB_HOST=172.18.0.2 +DB_PORT=5432 +DB_NAME=musicbrainz_db +DB_USER=musicbrainz +DB_PASSWORD=musicbrainz + +# MusicBrainz web server +MUSICBRAINZ_WEB_SERVER_PORT=5001 +EOF +``` + +**Note**: If you used the default port 5000, change `MUSICBRAINZ_WEB_SERVER_PORT=5001` to `MUSICBRAINZ_WEB_SERVER_PORT=5000`. 
+ +## Step 6: Test the Connection + +```bash +# Run a simple test to verify everything is working +docker-compose run --rm musicbrainz-cleaner python3 quick_test_20.py +``` + +## Service Details + +The Docker Compose setup includes: + +- **PostgreSQL Database** (`db`): Main MusicBrainz database +- **Solr Search** (`search`): Full-text search engine +- **Redis** (`redis`): Caching and session storage +- **Message Queue** (`mq`): Background job processing +- **MusicBrainz Web Server** (`musicbrainz`): Main web application +- **Indexer** (`indexer`): Search index maintenance + +## Ports Used + +- **5000/5001**: MusicBrainz web server (configurable) +- **5432**: PostgreSQL database (internal) +- **8983**: Solr search (internal) +- **6379**: Redis (internal) +- **5672**: Message queue (internal) + +## Stopping Services + +```bash +# Stop all services +cd musicbrainz-docker +docker-compose down + +# To also remove volumes (WARNING: this deletes all data) +docker-compose down -v +``` + +## Restarting Services + +```bash +# Restart all services +docker-compose restart + +# Or restart specific service +docker-compose restart db +``` + +## Monitoring Services + +```bash +# View logs for all services +docker-compose logs -f + +# View logs for specific service +docker-compose logs -f db +docker-compose logs -f musicbrainz + +# Check resource usage +docker stats +``` + +## Troubleshooting + +### Database Connection Issues + +```bash +# Check if database is running +docker-compose ps db + +# Check database logs +docker-compose logs db + +# Test database connection +docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT 1;" +``` + +### Memory Issues + +If you encounter memory issues: + +```bash +# Increase Docker memory limit in Docker Desktop settings +# Recommended: 8GB minimum, 16GB preferred + +# Check current memory usage +docker stats +``` + +### Platform Issues (Apple Silicon) + +If you're on Apple Silicon (M1/M2) and see platform warnings: + +```bash +# 
The services will still work, but you may see warnings about platform mismatch +# This is normal and doesn't affect functionality +``` + +## Performance Tips + +1. **Allocate sufficient memory** to Docker Desktop (8GB+ recommended) +2. **Use SSD storage** for better database performance +3. **Close other resource-intensive applications** while running the services +4. **Wait for full initialization** before running tests + +## Next Steps + +Once the services are running successfully: + +1. Run the quick test: `python3 quick_test_20.py` +2. Run larger tests: `python3 bulk_test_1000.py` +3. Use the cleaner on your own data: `python3 -m src.cli.main --input your_file.json --output cleaned.json` + +## 🔄 After System Reboot + +After restarting your Mac, you'll need to restart the MusicBrainz services: + +### Quick Restart (Recommended) +```bash +# Navigate to musicbrainz-cleaner directory +cd /Users/mattbruce/Documents/Projects/musicbrainz-server/musicbrainz-cleaner + +# If Docker Desktop is already running +./restart_services.sh + +# Or manually +cd ../musicbrainz-docker && MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d +``` + +### Full Restart (If you have issues) +```bash +# Complete setup including Docker checks +./start_services.sh +``` + +### Auto-start Setup (Optional) +1. **Enable Docker Desktop auto-start**: + - Open Docker Desktop + - Go to Settings → General + - Check "Start Docker Desktop when you log in" + +2. **Then just run**: `./restart_services.sh` after each reboot + +**Note**: Your data is preserved in Docker volumes, so you don't need to reconfigure anything after a reboot. + +## Support + +If you encounter issues: + +1. Check the logs: `docker-compose logs -f` +2. Verify Docker has sufficient resources +3. Ensure all prerequisites are met +4. 
Try restarting the services: `docker-compose restart` \ No newline at end of file diff --git a/data/known_artists.json b/data/known_artists.json index 87da653..254c885 100644 --- a/data/known_artists.json +++ b/data/known_artists.json @@ -222,6 +222,7 @@ "The Proclaimers", "The Stanley Brothers", "The Statler Brothers", + "The Tamperer featuring Maya", "The Walker Brothers", "The Wilburn Brothers", "Thompson Twins", diff --git a/quick_test_20.py b/quick_test_20.py new file mode 100644 index 0000000..ed6e1d2 --- /dev/null +++ b/quick_test_20.py @@ -0,0 +1,108 @@ +#!/usr/bin/env python3 +""" +Quick test script for 20 random songs +Simple single-threaded approach +""" + +import sys +import json +import time +from pathlib import Path + +# Add the src directory to the path +sys.path.insert(0, '/app') +from src.cli.main import MusicBrainzCleaner + +def main(): + print('🚀 Starting quick test with 20 random songs...') + + # Load songs + input_file = Path('data/songs.json') + if not input_file.exists(): + print('❌ songs.json not found') + return + + with open(input_file, 'r') as f: + all_songs = json.load(f) + + print(f'📊 Total songs available: {len(all_songs):,}') + + # Take 20 random songs + import random + sample_songs = random.sample(all_songs, 20) + print(f'🎯 Testing 20 random songs...') + + # Initialize cleaner + cleaner = MusicBrainzCleaner() + + # Process songs + found_artists = 0 + found_recordings = 0 + failed_songs = [] + + start_time = time.time() + + for i, song in enumerate(sample_songs, 1): + print(f' [{i:2d}/20] Processing: "{song.get("artist", "Unknown")}" - "{song.get("title", "Unknown")}"') + + try: + result, _success = cleaner.clean_song(song) + + artist_found = 'mbid' in result + recording_found = 'recording_mbid' in result + + if artist_found and recording_found: + found_artists += 1 + found_recordings += 1 + print(f' ✅ Found both artist and recording') + else: + failed_songs.append({ + 'original': song, + 'cleaned': result, + 'artist_found': 
artist_found, + 'recording_found': recording_found, + 'artist_name': song.get('artist', 'Unknown'), + 'title': song.get('title', 'Unknown') + }) + print(f' ❌ Artist: {artist_found}, Recording: {recording_found}') + + except Exception as e: + print(f' 💥 Error: {e}') + failed_songs.append({ + 'original': song, + 'cleaned': {'error': str(e)}, + 'artist_found': False, + 'recording_found': False, + 'artist_name': song.get('artist', 'Unknown'), + 'title': song.get('title', 'Unknown'), + 'error': str(e) + }) + + end_time = time.time() + processing_time = end_time - start_time + + # Calculate success rates + artist_success_rate = found_artists / 20 * 100 + recording_success_rate = found_recordings / 20 * 100 + failed_rate = len(failed_songs) / 20 * 100 + + print(f'\n📊 Final Results:') + print(f' ⏱️ Processing time: {processing_time:.2f} seconds') + print(f' 🚀 Speed: {20/processing_time:.1f} songs/second') + print(f' ✅ Artists found: {found_artists}/20 ({artist_success_rate:.1f}%)') + print(f' ✅ Recordings found: {found_recordings}/20 ({recording_success_rate:.1f}%)') + print(f' ❌ Failed songs: {len(failed_songs)} ({failed_rate:.1f}%)') + + # Show failed songs + if failed_songs: + print(f'\n🔍 Failed songs:') + for i, failed in enumerate(failed_songs, 1): + print(f' [{i}] "{failed["artist_name"]}" - "{failed["title"]}"') + print(f' Artist found: {failed["artist_found"]}, Recording found: {failed["recording_found"]}') + if 'error' in failed: + print(f' Error: {failed["error"]}') + else: + print('\n🎉 All songs processed successfully!') + +if __name__ == '__main__': + main() \ No newline at end of file diff --git a/restart_services.sh b/restart_services.sh new file mode 100755 index 0000000..8437faa --- /dev/null +++ b/restart_services.sh @@ -0,0 +1,19 @@ +#!/bin/bash + +# Quick restart script for after Mac reboots +# This assumes Docker Desktop is already running + +echo "🔄 Restarting MusicBrainz services..." 
+# Navigate to musicbrainz-docker +cd ../musicbrainz-docker + +# Start services +MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d + +echo "✅ Services started!" +echo "⏳ Database may take 5-10 minutes to fully initialize" +echo "" +echo "📊 Check status: docker-compose ps" +echo "📋 View logs: docker-compose logs -f db" +echo "🧪 Test when ready: cd ../musicbrainz-cleaner && docker-compose run --rm musicbrainz-cleaner python3 quick_test_20.py" \ No newline at end of file diff --git a/src/cli/main.py b/src/cli/main.py index 346306f..1d88bdd 100644 --- a/src/cli/main.py +++ b/src/cli/main.py @@ -276,8 +276,12 @@ class MusicBrainzCleaner: return collaborators - def clean_song(self, song: Dict[str, Any]) -> Dict[str, Any]: - print(f"Processing: {song.get('artist', 'Unknown')} - {song.get('title', 'Unknown')}") + def clean_song(self, song: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]: + """ + Clean a single song and return (cleaned_song, success_status) + """ + original_artist = song.get('artist', '') + original_title = song.get('title', '') # Find artist MBID artist_mbid = self.find_artist_mbid(song.get('artist', '')) @@ -289,13 +293,11 @@ class MusicBrainzCleaner: has_collaboration = len(collaborators) > 0 if artist_mbid is None and has_collaboration: - print(f" 🎯 Collaboration detected: {song.get('artist')}") # Try to find recording using artist credit approach if self.use_database: result = self.db.find_artist_credit(song.get('artist', ''), song.get('title', '')) if result: artist_credit_id, artist_string, recording_mbid = result - print(f" ✅ Found recording: {song.get('title')} (MBID: {recording_mbid})") # Update with the correct artist credit song['artist'] = artist_string @@ -309,11 +311,9 @@ class MusicBrainzCleaner: if artist_result and isinstance(artist_result, tuple) and len(artist_result) >= 2: song['mbid'] = artist_result[1] # Set the main artist's MBID - print(f" ✅ Updated to: {song['artist']} - {song.get('title')}") - return song 
+ return song, True else: - print(f" ❌ Could not find recording: {song.get('title')}") - return song + return song, False else: # Fallback to API method recording_mbid = self.find_recording_mbid(None, song.get('title', '')) @@ -323,37 +323,29 @@ class MusicBrainzCleaner: artist_string = self._build_artist_string(recording_info['artist-credit']) if artist_string: song['artist'] = artist_string - print(f" ✅ Updated to: {song['artist']} - {recording_info['title']}") song['title'] = recording_info['title'] song['recording_mbid'] = recording_mbid - return song - else: - print(f" ❌ Could not find recording: {song.get('title')}") - return song + return song, True + return song, False # Regular case (non-collaboration or collaboration not found) if not artist_mbid: - print(f" ❌ Could not find artist: {song.get('artist')}") - return song + return song, False # Get artist info artist_info = self.get_artist_info(artist_mbid) if artist_info: - print(f" ✅ Found artist: {artist_info['name']} (MBID: {artist_mbid})") song['artist'] = artist_info['name'] song['mbid'] = artist_mbid # Find recording MBID recording_mbid = self.find_recording_mbid(artist_mbid, song.get('title', '')) if not recording_mbid: - print(f" ❌ Could not find recording: {song.get('title')}") - return song + return song, False # Get recording info recording_info = self.get_recording_info(recording_mbid) if recording_info: - print(f" ✅ Found recording: {recording_info['title']} (MBID: {recording_mbid})") - # Update artist string if there are multiple artists, but preserve the artist MBID if self.use_database and recording_info.get('artist_credit'): song['artist'] = recording_info['artist_credit'] @@ -370,11 +362,11 @@ class MusicBrainzCleaner: song['title'] = recording_info['title'] song['recording_mbid'] = recording_mbid + return song, True - print(f" ✅ Updated to: {song['artist']} - {song['title']}") - return song + return song, False - def clean_songs_file(self, input_file: Path, output_file: 
Optional[Path] = None, limit: Optional[int] = None) -> Path: + def clean_songs_file(self, input_file: Path, output_file: Optional[Path] = None, limit: Optional[int] = None) -> Tuple[Path, List[Dict]]: try: # Read input file with open(input_file, 'r', encoding='utf-8') as f: @@ -382,7 +374,7 @@ class MusicBrainzCleaner: if not isinstance(songs, list): print("Error: Input file should contain a JSON array of songs") - return input_file + return input_file, [] # Apply limit if specified if limit is not None: @@ -399,11 +391,31 @@ class MusicBrainzCleaner: # Clean each song cleaned_songs = [] + failed_songs = [] + success_count = 0 + fail_count = 0 + for i, song in enumerate(songs, 1): - print(f"\n[{i}/{len(songs)}]", end=" ") - cleaned_song = self.clean_song(song) + cleaned_song, success = self.clean_song(song) cleaned_songs.append(cleaned_song) + if success: + success_count += 1 + print(f"[{i}/{len(songs)}] ✅ PASS") + else: + fail_count += 1 + print(f"[{i}/{len(songs)}] ❌ FAIL") + # Store failed song info for report + failed_songs.append({ + 'index': i, + 'original_artist': song.get('artist', ''), + 'original_title': song.get('title', ''), + 'cleaned_artist': cleaned_song.get('artist', ''), + 'cleaned_title': cleaned_song.get('title', ''), + 'has_mbid': 'mbid' in cleaned_song, + 'has_recording_mbid': 'recording_mbid' in cleaned_song + }) + # Only add delay for API calls, not database queries if not self.use_database: time.sleep(API_REQUEST_DELAY) @@ -412,21 +424,37 @@ class MusicBrainzCleaner: with open(output_file, 'w', encoding='utf-8') as f: json.dump(cleaned_songs, f, indent=2, ensure_ascii=False) - print(f"\n{PROGRESS_SEPARATOR}") - print(SUCCESS_MESSAGES['processing_complete']) - print(SUCCESS_MESSAGES['output_saved'].format(file_path=output_file)) + # Generate failure report + report_file = input_file.parent / f"{input_file.stem}_failure_report.json" + with open(report_file, 'w', encoding='utf-8') as f: + json.dump({ + 'summary': { + 'total_songs': 
len(songs), + 'successful': success_count, + 'failed': fail_count, + 'success_rate': f"{(success_count/len(songs)*100):.1f}%" + }, + 'failed_songs': failed_songs + }, f, indent=2, ensure_ascii=False) - return output_file + print(f"\n{PROGRESS_SEPARATOR}") + print(f"✅ SUCCESS: {success_count} songs") + print(f"❌ FAILED: {fail_count} songs") + print(f"📊 SUCCESS RATE: {(success_count/len(songs)*100):.1f}%") + print(f"💾 CLEANED DATA: {output_file}") + print(f"📋 FAILURE REPORT: {report_file}") + + return output_file, failed_songs except FileNotFoundError: print(f"Error: File '{input_file}' not found") - return input_file + return input_file, [] except json.JSONDecodeError: print(f"Error: Invalid JSON in file '{input_file}'") - return input_file + return input_file, [] except Exception as e: print(f"Error processing file: {e}") - return input_file + return input_file, [] finally: # Clean up database connection if self.use_database and hasattr(self, 'db'): @@ -601,7 +629,7 @@ def main() -> int: # Process the file cleaner = MusicBrainzCleaner(use_database=use_database) - result_path = cleaner.clean_songs_file(input_file, output_file, limit) + result_path, failed_songs = cleaner.clean_songs_file(input_file, output_file, limit) return ExitCode.SUCCESS diff --git a/start_services.sh b/start_services.sh new file mode 100755 index 0000000..f655bc0 --- /dev/null +++ b/start_services.sh @@ -0,0 +1,157 @@ +#!/bin/bash + +# MusicBrainz Cleaner - Quick Start Script +# This script automates the startup of MusicBrainz services + +set -e + +echo "🚀 Starting MusicBrainz services..." 
+ +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' # No Color + +# Function to print colored output +print_status() { + echo -e "${BLUE}[INFO]${NC} $1" +} + +print_success() { + echo -e "${GREEN}[SUCCESS]${NC} $1" +} + +print_warning() { + echo -e "${YELLOW}[WARNING]${NC} $1" +} + +print_error() { + echo -e "${RED}[ERROR]${NC} $1" +} + +# Check if Docker is running +if ! docker info > /dev/null 2>&1; then + print_error "Docker is not running. Please start Docker Desktop first." + exit 1 +fi + +print_success "Docker is running" + +# Check if we're in the right directory +if [ ! -f "docker-compose.yml" ]; then + print_error "This script must be run from the musicbrainz-cleaner directory" + exit 1 +fi + +# Check if musicbrainz-docker directory exists +if [ ! -d "../musicbrainz-docker" ]; then + print_error "musicbrainz-docker directory not found. Please ensure you're in the musicbrainz-server directory." + exit 1 +fi + +# Navigate to musicbrainz-docker +cd ../musicbrainz-docker + +print_status "Checking for port conflicts..." + +# Check if port 5000 is available +if lsof -i :5000 > /dev/null 2>&1; then + print_warning "Port 5000 is in use. Using port 5001 instead." + PORT=5001 +else + print_success "Port 5000 is available" + PORT=5000 +fi + +# Stop any existing containers +print_status "Stopping existing containers..." +docker-compose down > /dev/null 2>&1 || true + +# Remove any conflicting containers +print_status "Cleaning up conflicting containers..." +docker rm -f musicbrainz-docker-db-1 > /dev/null 2>&1 || true + +# Start services +print_status "Starting MusicBrainz services on port $PORT..." +MUSICBRAINZ_WEB_SERVER_PORT=$PORT docker-compose up -d + +print_success "Services started successfully!" + +# Wait for database to be ready +print_status "Waiting for database to initialize (this may take 5-10 minutes)..." 
+print_status "You can monitor progress with: docker-compose logs -f db" + +# Check if database is ready +attempts=0 +max_attempts=60 +while [ $attempts -lt $max_attempts ]; do + if docker-compose exec -T db pg_isready -U musicbrainz > /dev/null 2>&1; then + print_success "Database is ready!" + break + fi + attempts=$((attempts + 1)) + print_status "Waiting for database... (attempt $attempts/$max_attempts)" + sleep 10 +done + +if [ $attempts -eq $max_attempts ]; then + print_warning "Database may still be initializing. You can check status with: docker-compose logs db" +fi + +# Create .env file in musicbrainz-cleaner directory +cd ../musicbrainz-cleaner + +print_status "Creating environment configuration..." + +cat > .env << EOF +# Database connection (default Docker setup) +DB_HOST=172.18.0.2 +DB_PORT=5432 +DB_NAME=musicbrainz_db +DB_USER=musicbrainz +DB_PASSWORD=musicbrainz + +# MusicBrainz web server +MUSICBRAINZ_WEB_SERVER_PORT=$PORT +EOF + +print_success "Environment configuration created" + +# Test connection +print_status "Testing connection..." +if docker-compose run --rm musicbrainz-cleaner python3 -c " +import sys +sys.path.insert(0, '/app') +from src.api.database import MusicBrainzDatabase +try: + db = MusicBrainzDatabase() + print('✅ Database connection successful') +except Exception as e: + print(f'❌ Database connection failed: {e}') + sys.exit(1) +" 2>/dev/null; then + print_success "Connection test passed!" +else + print_warning "Connection test failed. Services may still be initializing." +fi + +echo "" +print_success "MusicBrainz services are now running!" +echo "" +echo "📊 Service Status:" +echo " - Web Server: http://localhost:$PORT" +echo " - Database: PostgreSQL (internal)" +echo " - Search: Solr (internal)" +echo "" +echo "🧪 Next steps:" +echo " 1. Run quick test: python3 quick_test_20.py" +echo " 2. Run larger test: python3 bulk_test_1000.py" +echo " 3. 
Use cleaner: python3 -m src.cli.main --input your_file.json --output cleaned.json" +echo "" +echo "📋 Useful commands:" +echo " - View logs: cd ../musicbrainz-docker && docker-compose logs -f" +echo " - Stop services: cd ../musicbrainz-docker && docker-compose down" +echo " - Check status: cd ../musicbrainz-docker && docker-compose ps" +echo "" \ No newline at end of file
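
Review note: this change makes `clean_song` return a `(cleaned_song, success)` tuple instead of a bare dict, so every caller has to unpack the result before inspecting keys such as `mbid` or `recording_mbid`. The sketch below illustrates one way a caller might do that and build a summary shaped like the failure report in `clean_songs_file`. `StubCleaner` and `summarize` are hypothetical names invented for illustration; the stub only mimics the new return shape and does not touch the database or the API.

```python
from typing import Any, Dict, List, Tuple


class StubCleaner:
    """Hypothetical stand-in for MusicBrainzCleaner (assumption: no DB or API access)."""

    def clean_song(self, song: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]:
        # Work on a copy so the caller's original values stay intact.
        cleaned = dict(song)
        # Pretend that only songs with a non-empty artist can be matched.
        if cleaned.get("artist"):
            cleaned["mbid"] = "stub-artist-mbid"
            cleaned["recording_mbid"] = "stub-recording-mbid"
            return cleaned, True
        return cleaned, False


def summarize(cleaner: Any, songs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Run the cleaner over songs and build a summary dict."""
    failed_songs: List[Dict[str, Any]] = []
    successful = 0
    for index, song in enumerate(songs, 1):
        # Capture the original values before cleaning, in case the cleaner mutates its input.
        original_artist = song.get("artist", "")
        original_title = song.get("title", "")
        # Unpack the tuple; testing `'mbid' in result` on the tuple itself would always be False.
        cleaned, success = cleaner.clean_song(song)
        if success:
            successful += 1
        else:
            failed_songs.append({
                "index": index,
                "original_artist": original_artist,
                "original_title": original_title,
                "has_mbid": "mbid" in cleaned,
                "has_recording_mbid": "recording_mbid" in cleaned,
            })
    total = len(songs)
    # Guard against an empty input list to avoid dividing by zero.
    rate = successful / total * 100 if total else 0.0
    return {
        "total_songs": total,
        "successful": successful,
        "failed": len(failed_songs),
        "success_rate": f"{rate:.1f}%",
        "failed_songs": failed_songs,
    }


if __name__ == "__main__":
    demo = [{"artist": "ABBA", "title": "Waterloo"}, {"artist": "", "title": "Untitled"}]
    print(summarize(StubCleaner(), demo))
```

Two design choices in the sketch are worth checking against the real code: it reads the original artist and title before calling `clean_song`, because the method in this diff appears to update the song dict in place, and it guards the success-rate division so an empty input list does not raise `ZeroDivisionError`.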