Signed-off-by: Matt Bruce <mbrucedogs@gmail.com>

Matt Bruce 2025-07-31 18:19:51 -05:00
parent 4bf359ee5d
commit ddbc6a9ebc
5 changed files with 422 additions and 24 deletions

76
PRD.md

@@ -8,6 +8,49 @@
**Date:** December 19, 2024
**Status:** Production Ready with Advanced Database Integration ✅
## 🚀 Quick Start for New Sessions
**For new chat sessions or after system reboots, follow this exact sequence:**
### 1. Start MusicBrainz Services
```bash
# Quick restart (recommended)
./restart_services.sh
# Or full restart (if you have issues)
./start_services.sh
```
### 2. Wait for Services to Initialize
- **Database**: 5-10 minutes to fully load
- **Web server**: 2-3 minutes to start responding
- **Check status**: `cd ../musicbrainz-docker && docker-compose ps`
### 3. Verify Services Are Ready
```bash
# Test web server
curl -s http://localhost:5001 | head -5
# Test database from the musicbrainz-docker directory (should show 2.6M+ artists)
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"
# Test cleaner connection from the musicbrainz-cleaner directory
docker-compose run --rm musicbrainz-cleaner python3 -c "from src.api.database import MusicBrainzDatabase; db = MusicBrainzDatabase(); print('Connection result:', db.connect())"
```
### 4. Run Tests
```bash
# Test 100 random songs
docker-compose run --rm musicbrainz-cleaner python3 test_100_random.py
# Or other test scripts
docker-compose run --rm musicbrainz-cleaner python3 [script_name].py
```
**⚠️ Critical**: Always run scripts via Docker. The database host is the Docker service name `db`, which resolves only inside the Docker network, so the cleaner cannot connect to the database from outside the container.
**📋 Troubleshooting**: See `TROUBLESHOOTING.md` for common issues and solutions.
## Problem Statement
Users have song data in JSON format with inconsistent artist names, song titles, and missing MusicBrainz identifiers. They need a tool to:
@@ -401,6 +444,35 @@ python musicbrainz_cleaner.py --test-connection
- **Database issues**: Check logs with `docker-compose logs -f db`
- **Memory issues**: Increase Docker Desktop memory allocation (8GB+ recommended)
#### Critical Startup Issues & Solutions
**Issue 1: Database Connection Refused**
- **Symptoms**: Cleaner reports "Connection refused" when trying to connect to database
- **Root Cause**: Database container not fully initialized or wrong host configuration
- **Solution**:
```bash
# Check database status
docker-compose logs db | tail -10
# Verify database is ready
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"
```
**Issue 2: Wrong Database Host Configuration**
- **Symptoms**: Cleaner tries to connect to `172.18.0.2` but fails
- **Root Cause**: Hardcoded IP address in database connection
- **Solution**: Use Docker service name `db` instead of IP address in `src/api/database.py`
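One way to avoid hardcoding either value is to read the host from the environment and fall back to the service name. This is only a sketch: the `MB_DB_HOST` variable name and the `resolve_db_host` helper are assumptions for illustration and are not part of `src/api/database.py`.

```python
import os

DEFAULT_DB_HOST = "db"  # Docker Compose service name recommended in the fix above


def resolve_db_host(environ=None):
    """Return the database host, preferring a hypothetical MB_DB_HOST override."""
    env = os.environ if environ is None else environ
    host = env.get("MB_DB_HOST", "").strip()
    # An unset or blank override falls back to the service name.
    return host or DEFAULT_DB_HOST


if __name__ == "__main__":
    print("Connecting to host:", resolve_db_host())
```

The returned value could then be passed as the `host` argument of `psycopg2.connect`, which keeps the default correct inside Docker while still allowing an override.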
**Issue 3: Test Script Logic Error**
- **Symptoms**: Test shows 0% success rate despite finding artists
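- **Root Cause**: The test script checks `'mbid' in result`, but `result` is a tuple `(song_dict, success_boolean)`, so the membership test compares against the tuple's elements and never looks inside the dictionary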
- **Solution**: Extract song dictionary from tuple: `cleaned_song, success = result`
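The failure mode can be reproduced without the cleaner. The sketch below uses a stand-in `fake_clean` function (hypothetical; the real method name and returned fields may differ) that returns a `(song_dict, success)` tuple, matching the unpacking `cleaned_song, success = result` shown above.

```python
def fake_clean(song):
    """Stand-in for the cleaner: returns a (song_dict, success) tuple."""
    cleaned = dict(song, mbid="hypothetical-mbid")  # placeholder, not a real MBID
    return cleaned, True


def success_rate(results):
    """Percentage of results whose song dictionary carries an 'mbid' key."""
    if not results:
        return 0.0
    found = sum(1 for cleaned_song, _success in results if "mbid" in cleaned_song)
    return 100.0 * found / len(results)


results = [fake_clean({"artist": "Bananarama", "title": "Venus"})]

# Membership on the tuple compares 'mbid' with each element (a dict and a bool),
# so it is never true and the buggy count stays at zero.
buggy = sum(1 for result in results if "mbid" in result)

print(buggy, success_rate(results))
```

Unpacking inside the generator expression keeps the check on the dictionary, which is the only place the `mbid` key can appear.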
**Issue 4: Services Not Fully Initialized**
- **Symptoms**: API returns empty results even though database has data
- **Root Cause**: MusicBrainz web server still starting up
- **Solution**: Wait until the web server responds (`curl -s http://localhost:5001 | head -5`) and the database log shows `database system is ready` before running the cleaner
### Support
- GitHub issues for bug reports
- Documentation updates
@@ -439,4 +511,6 @@ python musicbrainz_cleaner.py --test-connection
- **Port Conflicts**: Common on macOS, requiring automatic detection and resolution
- **System Reboots**: Services need to be restarted after system reboots, but data persists in Docker volumes
- **Resource Requirements**: MusicBrainz services require significant memory (8GB+ recommended) and disk space
- **Platform Compatibility**: Apple Silicon (M1/M2) works but may show platform mismatch warnings
- **Platform Compatibility**: Apple Silicon (M1/M2) works but may show platform mismatch warnings
- **Database Connection Issues**: Common startup problems include wrong host configuration and incomplete initialization
- **Test Script Logic**: Cleaner methods return a `(song_dict, success)` tuple, so tests must unpack it before checking for `mbid`

119
README.md

@@ -2,6 +2,49 @@
A powerful command-line tool that cleans and normalizes your song data using the MusicBrainz database. **Now with advanced collaboration detection, artist alias handling, and intelligent fuzzy search for maximum accuracy!**
## 🚀 Quick Start for New Sessions
**If you're starting fresh or after a reboot, follow this exact sequence:**
### 1. Start MusicBrainz Services
```bash
# Quick restart (recommended)
./restart_services.sh
# Or full restart (if you have issues)
./start_services.sh
```
### 2. Wait for Services to Initialize
- **Database**: 5-10 minutes to fully load
- **Web server**: 2-3 minutes to start responding
- **Check status**: `cd ../musicbrainz-docker && docker-compose ps`
### 3. Verify Services Are Ready
```bash
# Test web server
curl -s http://localhost:5001 | head -5
# Test database from the musicbrainz-docker directory (should show 2.6M+ artists)
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"
# Test cleaner connection from the musicbrainz-cleaner directory
docker-compose run --rm musicbrainz-cleaner python3 -c "from src.api.database import MusicBrainzDatabase; db = MusicBrainzDatabase(); print('Connection result:', db.connect())"
```
### 4. Run Tests
```bash
# Test 100 random songs
docker-compose run --rm musicbrainz-cleaner python3 test_100_random.py
# Or other test scripts
docker-compose run --rm musicbrainz-cleaner python3 [script_name].py
```
**⚠️ Important**: Always run scripts via Docker. The database host is the Docker service name `db`, which resolves only inside the Docker network, so the cleaner cannot connect to the database from outside the container.
**📋 Troubleshooting**: See `TROUBLESHOOTING.md` for common issues and solutions.
## ✨ What's New in v3.0
- **🚀 Direct Database Access**: Connect directly to PostgreSQL for 10x faster performance
@@ -115,6 +158,82 @@ cd ../musicbrainz-docker && MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -
**Note**: Your data is preserved in Docker volumes, so you don't need to reconfigure anything after a reboot.
## 🚨 Common Startup Issues & Fixes
### Issue 1: Database Connection Refused
**Problem**: Cleaner can't connect to database with error "Connection refused"
**Root Cause**: Database container not fully initialized or wrong host configuration
**Fix**:
```bash
# Wait for database to be ready (check logs)
cd ../musicbrainz-docker && docker-compose logs db | tail -10
# Verify database is accepting connections
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"
```
### Issue 2: Wrong Database Host Configuration
**Problem**: Cleaner tries to connect to `172.18.0.2` but can't reach it
**Root Cause**: Hardcoded IP address in database connection
**Fix**: Use Docker service name `db` instead of IP address
```python
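# In src/api/database.py, replace the hardcoded IP with the service name:
host='172.18.0.2'  # ❌ Wrong: container IPs can change between restarts
host='db'          # ✅ Correct: Docker service name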
```
### Issue 3: Test Script Logic Error
**Problem**: Test shows 0% success rate despite finding artists
**Root Cause**: Test script checking `'mbid' in result` where `result` is a tuple `(song_dict, success_boolean)`
**Fix**: Extract song dictionary from tuple
```python
# Wrong:
artist_found = 'mbid' in result
# Correct:
cleaned_song, success = result
artist_found = 'mbid' in cleaned_song
```
### Issue 4: Services Not Fully Initialized
**Problem**: API returns empty results even though database has data
**Root Cause**: MusicBrainz web server still starting up
**Fix**: Wait for services to be fully ready
```bash
# Check if web server is responding
curl -s http://localhost:5001 | head -5
# Wait for database to be ready
docker-compose logs db | grep "database system is ready"
```
### Issue 5: Port Conflicts
**Problem**: Port 5000 already in use
**Root Cause**: Another service using the port
**Fix**: Use alternative port
```bash
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d
```
### Issue 6: Container Name Conflicts
**Problem**: "Container name already in use" error
**Root Cause**: Previous containers not properly cleaned up
**Fix**: Remove conflicting containers
```bash
docker-compose down
docker rm -f <container_name>
```
## 🔧 Startup Checklist
Before running tests, verify:
1. ✅ Docker Desktop is running
2. ✅ All containers are up: `docker-compose ps`
3. ✅ Database is ready: `docker-compose logs db | grep "ready"`
4. ✅ Web server responds: `curl -s http://localhost:5001`
5. ✅ Database has data: `docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"`
6. ✅ Cleaner can connect: run the cleaner connection test from Quick Start step 3
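The checklist can be automated as an ordered series of checks that stops at the first failure. The runner below is a sketch: `run_checklist` and the check names are illustrative, and the lambdas stand in for real probes (for example, calling `docker-compose ps` through `subprocess`).

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], bool]]


def run_checklist(checks: List[Check]) -> Optional[str]:
    """Run checks in order; return the first failing name, or None if all pass."""
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:  # a check that raises counts as a failure
            ok = False
        print("PASS" if ok else "FAIL", name)
        if not ok:
            return name
    return None


if __name__ == "__main__":
    # Placeholder checks; replace each lambda with a real probe.
    failed = run_checklist([
        ("Docker is running", lambda: True),
        ("Containers are up", lambda: True),
        ("Database is ready", lambda: True),
    ])
    print("First failing step:", failed)
```

Stopping at the first failure mirrors the order of the list: later steps are not meaningful until the earlier ones pass.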
## 📋 Requirements
- **Python 3.6+**

214
TROUBLESHOOTING.md Normal file

@@ -0,0 +1,214 @@
# 🚨 MusicBrainz Cleaner Troubleshooting Guide
This guide documents common issues encountered when starting and running the MusicBrainz cleaner, along with their solutions.
## 📋 Key Files for New Sessions
When starting a new chat session, reference these files in order:
1. **`README.md`** - Quick start guide and basic usage
2. **`PRD.md`** - Technical specifications and requirements
3. **`SETUP.md`** - Detailed setup instructions
4. **`TROUBLESHOOTING.md`** - This file - common issues and solutions
5. **`start_services.sh`** - Automated service startup script
6. **`restart_services.sh`** - Quick restart script for after reboots
## Quick Diagnostic Commands
```bash
# Check if the Docker daemon is running (docker --version only reports the client version)
docker info
# Check container status
cd ../musicbrainz-docker && docker-compose ps
# Check database logs
docker-compose logs db | tail -10
# Check web server logs
docker-compose logs musicbrainz | tail -10
# Test web server response
curl -s http://localhost:5001 | head -5
# Test database connection
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"
# Test cleaner connection
cd ../musicbrainz-cleaner && docker-compose run --rm musicbrainz-cleaner python3 -c "from src.api.database import MusicBrainzDatabase; db = MusicBrainzDatabase(); print('Connection result:', db.connect())"
```
## Common Issues & Solutions
### 🚫 Issue 1: Database Connection Refused
**Error Message:**
```
Connection error: connection to server at "172.18.0.2", port 5432 failed: Connection refused
```
**Root Cause:** Database container not fully initialized or wrong host configuration
**Solutions:**
1. **Wait for database initialization:**
```bash
cd ../musicbrainz-docker
docker-compose logs db | grep "database system is ready"
```
2. **Fix host configuration in database.py:**
```python
# In src/api/database.py, replace the hardcoded IP with the service name:
host='172.18.0.2'  # ❌ Wrong: container IPs can change between restarts
host='db'          # ✅ Correct: Docker service name
```
3. **Verify database is ready:**
```bash
docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"
```
### 🚫 Issue 2: Test Shows 0% Success Rate
**Symptoms:** Test script reports 0% success despite finding artists in logs
**Root Cause:** Test script logic error - checking `'mbid' in result` where `result` is a tuple
**Solution:** Fix test script to extract song dictionary from tuple:
```python
# Wrong:
artist_found = 'mbid' in result
# Correct:
cleaned_song, success = result
artist_found = 'mbid' in cleaned_song
```
### 🚫 Issue 3: Port Already in Use
**Error Message:**
```
ports are not available: exposing port TCP 0.0.0.0:5000 ... bind: address already in use
```
**Solution:**
```bash
# Kill the process using port 5000 (on recent macOS this is often the AirPlay Receiver, which restarts automatically)
lsof -ti:5000 | xargs kill -9
# Or use alternative port
MUSICBRAINZ_WEB_SERVER_PORT=5001 docker-compose up -d
```
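Before choosing between the two options, it can help to check which port is actually free. The following is a minimal sketch, not part of the startup scripts: `port_is_free` and `first_free_port` are hypothetical helpers, and the bind test reflects only the moment it runs, so another process could still claim the port afterwards.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates=(5000, 5001)):
    """Return the first free port from candidates, or None if all are taken."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    print("Suggested MUSICBRAINZ_WEB_SERVER_PORT:", first_free_port())
```

The default host `0.0.0.0` matches the address in the error message above, so a listener on any interface is detected.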
### 🚫 Issue 4: Container Name Conflicts
**Error Message:**
```
Conflict. The container name ... is already in use
```
**Solution:**
```bash
# Stop and remove existing containers
docker-compose down
# Force remove specific container if needed
docker rm -f <container_name>
# Restart services
docker-compose up -d
```
### 🚫 Issue 5: Docker Not Running
**Error Message:**
```
Cannot connect to the Docker daemon
```
**Solution:**
```bash
# Start Docker Desktop
open -a Docker
# Wait for Docker to start, then restart services
./restart_services.sh
```
### 🚫 Issue 6: API Returns Empty Results
**Symptoms:** API calls return empty results even though database has data
**Root Cause:** MusicBrainz web server not fully initialized
**Solution:**
```bash
# Wait for web server to be ready
sleep 60
# Test API response
curl -s "http://localhost:5001/ws/2/artist/?query=name:The%20Beatles&fmt=json"
```
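A fixed `sleep 60` can be too short or needlessly long. A polling loop is one alternative; the sketch below is an assumption about how such a wait could look, with `wait_until_ready` and `web_server_responds` as illustrative names. It treats a successful HTTP response from the URL used above as "ready", which may be too lenient for a server that is still loading data.

```python
import time
import urllib.error
import urllib.request


def web_server_responds(url="http://localhost:5001", timeout=5):
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(probe, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Call probe() until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_ready(web_server_responds)` would poll every 5 seconds for up to 5 minutes; the `probe` parameter accepts any zero-argument callable, so the same loop could wrap a database check.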
## Startup Checklist
Before running any tests, verify:
1. ✅ **Docker Desktop is running**
2. ✅ **All containers are up:** `docker-compose ps`
3. ✅ **Database is ready:** `docker-compose logs db | grep "ready"`
4. ✅ **Web server responds:** `curl -s http://localhost:5001`
5. ✅ **Database has data:** `docker-compose exec db psql -U musicbrainz -d musicbrainz_db -c "SELECT COUNT(*) FROM artist;"`
6. ✅ **Cleaner can connect:** Run the cleaner connection test from Quick Diagnostic Commands
## Performance Issues
### Slow Processing
- **Cause:** Database queries taking too long
- **Solution:** Ensure database has proper indexes and is fully loaded
### Memory Issues
- **Cause:** Docker Desktop memory allocation too low
- **Solution:** Increase Docker Desktop memory to 8GB+
### Platform Warnings
- **Cause:** Apple Silicon (M1/M2) platform mismatch
- **Solution:** These warnings can be ignored - services work correctly
## Recovery Procedures
### Complete Reset
```bash
# Stop all services and remove containers and volumes (⚠️ WARNING: this deletes the database data)
cd ../musicbrainz-docker && docker-compose down -v
# Restart from scratch (the startup script lives in the musicbrainz-cleaner directory)
cd ../musicbrainz-cleaner && ./start_services.sh
```
### Quick Restart
```bash
# Quick restart (preserves data)
./restart_services.sh
```
## Getting Help
If you encounter issues not covered in this guide:
1. Check the logs: `docker-compose logs -f`
2. Verify system requirements are met
3. Try the complete reset procedure
4. Check the main README.md for additional troubleshooting steps
## Prevention Tips
1. **Always use the startup scripts** (`start_services.sh` or `restart_services.sh`)
2. **Wait for services to fully initialize** before running tests
3. **Use the startup checklist** before running any tests
4. **Keep Docker Desktop memory allocation** at 8GB or higher
5. **Use port 5001** if port 5000 is busy

View File

@@ -192819,7 +192819,7 @@
"genre": "Karaoke",
"guid": "08c840ce-8b80-4856-6132-8d7bf9a357e9",
"path": "z://MP4\\Sing King Karaoke\\Yebba - My Mind Karaoke Version).mp4",
"title": "My Mind Karaoke Version)"
"title": "My Mind (Karaoke Version)"
},
{
"artist": "Yeng Constantino",
@@ -216756,13 +216756,13 @@
"title": "The More You Live, The More You Love"
},
{
"artist": "A Flock Of Seagulls- I Ran",
"artist": "A Flock Of Seagulls",
"disabled": false,
"favorite": false,
"genre": "Karaoke",
"guid": "27214162-6aa8-13e8-9721-6f08e7956084",
"path": "z://MP4\\ZoomKaraokeOfficial\\A Flock Of Seagulls- I Ran - Karaoke Version from Zoom Karaoke.mp4",
"title": "Karaoke Version from Zoom Karaoke"
"title": "I Ran"
},
{
"artist": "A Goofy Movie",
@@ -219192,7 +219192,7 @@
"genre": "Karaoke",
"guid": "b5b380d6-6699-d1b6-095b-e7721f553838",
"path": "z://MP4\\ZoomKaraokeOfficial\\Annie Soundtrack - Tomorrow Karaoke Version from Zoom Karaoke (1982 Version).mp4",
"title": "Tomorrow - Karaoke Version from Zoom Karaoke (1982 Version)"
"title": "Tomorrow"
},
{
"artist": "Another Level",
@@ -220407,7 +220407,7 @@
"genre": "Karaoke",
"guid": "892e330d-aeb9-195d-e2ac-1d2ed44b00f2",
"path": "z://MP4\\ZoomKaraokeOfficial\\Bananarama - Venus Karaoke Version from Zoom Karaoke (Lyric Fixed).mp4",
"title": "Venus - Karaoke Version from Zoom Karaoke (Lyric Fixed)"
"title": "Venus (Lyric Fixed)"
},
{
"artist": "Bananarama",
@@ -220416,7 +220416,7 @@
"genre": "Karaoke",
"guid": "3cda0061-edc5-71eb-b1ce-a592de40fed8",
"path": "z://MP4\\ZoomKaraokeOfficial\\Bananarama - Venus Karaoke Version from Zoom Karaoke (Old Version).mp4",
"title": "Venus - Karaoke Version from Zoom Karaoke (Old Version)"
"title": "Venus (Old Version)"
},
{
"artist": "Band Aid 30",
@@ -227112,7 +227112,7 @@
"genre": "Karaoke",
"guid": "89e2f35a-d52a-54f7-da46-8a590ee38f68",
"path": "z://MP4\\ZoomKaraokeOfficial\\Charli XCX - Speed Drive Karaoke Version from Zoom Karaoke (Barbie Movie).mp4",
"title": "Speed Drive - Karaoke Version from Zoom Karaoke (Barbie Movie)"
"title": "Speed Drive (Barbie Movie)"
},
{
"artist": "Charli XCX ft. Ariana Grande",
@@ -230082,7 +230082,7 @@
"genre": "Karaoke",
"guid": "0173e449-92bc-8c75-7054-105227a56c19",
"path": "z://MP4\\ZoomKaraokeOfficial\\Darlene Love - All Alone On Christmas Karaoke Version from Zoom Karaoke (from Home Alone).mp4",
"title": "All Alone On Christmas - Karaoke Version from Zoom Karaoke (from 'Home Alone')"
"title": "All Alone On Christmas (from 'Home Alone')"
},
{
"artist": "Darts",
@@ -230217,7 +230217,7 @@
"genre": "Karaoke",
"guid": "a48c4cc6-7a32-57e7-48ce-14ff7692b8d1",
"path": "z://MP4\\ZoomKaraokeOfficial\\Dave Edmunds - From Small Things (Big Things One Day Come) Karaoke Version from Zoom Karaoke.mp4",
"title": "From Small Things (Big Things One Day Come) - Karaoke Version from Zoom Karaoke."
"title": "From Small Things (Big Things One Day Come)."
},
{
"artist": "Dave Edmunds",
@@ -239876,15 +239876,6 @@
"path": "z://MP4\\ZoomKaraokeOfficial\\Free - Wishing Well.mp4",
"title": "Wishing Well"
},
{
"artist": "Fremantle Dockers Theme Song",
"disabled": false,
"favorite": false,
"genre": "Karaoke",
"guid": "156939ea-5add-7103-a46f-e2ccac419d80",
"path": "z://MP4\\ZoomKaraokeOfficial\\Fremantle Dockers Theme Song - Karaoke Version from Zoom Karaoke Australian Football League.mp4",
"title": "Karaoke Version from Zoom Karaoke - Australian Football League"
},
{
"artist": "Freya Ridings",
"disabled": false,
@@ -284426,7 +284417,7 @@
"title": "You're No Good"
},
{
"artist": "The Tamperer ft. Maya",
"artist": "The Tamperer featuring Maya",
"disabled": false,
"favorite": false,
"genre": "Karaoke",
@@ -284435,7 +284426,7 @@
"title": "Feel It"
},
{
"artist": "The Tamperer ft. Maya",
"artist": "The Tamperer featuring Maya",
"disabled": false,
"favorite": false,
"genre": "Karaoke",
@@ -306813,7 +306804,7 @@
"title": "Metal Postcard"
},
{
"artist": "Lavato, Demi & Joe Jonas",
"artist": "Lovato, Demi & Joe Jonas",
"disabled": false,
"favorite": false,
"genre": "Karaoke",
@@ -308682,7 +308673,7 @@
"title": "I Run To You"
},
{
"artist": "Lavato, Demi & Joe Jonas",
"artist": "Lovato, Demi & Joe Jonas",
"disabled": false,
"favorite": false,
"genre": "Karaoke",

View File

@@ -34,7 +34,7 @@ class MusicBrainzDatabase:
try:
# Use the direct connection method that works
self.connection = psycopg2.connect(
host='172.18.0.2', # Docker container IP that works
host='db', # Use Docker service name
port=self.port,
database=self.database,
user=self.user,