diff --git a/COMMANDS.md b/COMMANDS.md index 5cd1ecf..121b1b8 100644 --- a/COMMANDS.md +++ b/COMMANDS.md @@ -2,7 +2,7 @@ ## Overview -The MusicBrainz Data Cleaner is a command-line interface (CLI) tool that processes JSON song data files and cleans/normalizes the metadata using the MusicBrainz database. The tool creates separate output files for successful and failed songs, along with detailed processing reports. +The MusicBrainz Data Cleaner is a command-line interface (CLI) tool that processes JSON song data files and cleans/normalizes the metadata using the MusicBrainz database. The tool uses an interface-based architecture with dependency injection for clean, maintainable code. It creates separate output files for successful and failed songs, along with detailed processing reports. ## Basic Command Structure @@ -277,6 +277,24 @@ Error: Source file should contain a JSON array of songs ModuleNotFoundError: No module named 'requests' ``` +## Architecture Overview + +### Interface-Based Design + +The tool uses a clean interface-based architecture: + +- **`MusicBrainzDataProvider` Interface**: Common protocol for data access +- **`DataProviderFactory`**: Creates appropriate provider (database or API) +- **`SongProcessor`**: Centralized processing logic using the interface +- **Dependency Injection**: CLI depends on interfaces, not concrete classes + +### Data Flow + +1. **CLI** uses `DataProviderFactory` to create data provider +2. **Factory** returns either database or API implementation +3. **SongProcessor** processes songs using the common interface +4. **Same logic** works regardless of provider type + ## Environment Configuration ### Docker Environment diff --git a/PRD.md b/PRD.md index de6f5c1..260304a 100644 --- a/PRD.md +++ b/PRD.md @@ -162,6 +162,7 @@ Users have song data in JSON format with inconsistent artist names, song titles, - **Primary:** Direct PostgreSQL database access - **Fallback:** MusicBrainz REST API (local server) - **Interface:** Command-line (CLI) +- **Design Pattern:** Interface-based architecture with dependency injection ### Project Structure ``` @@ -169,20 +170,27 @@ src/ ├── __init__.py # Package initialization ├── api/ # API-related modules │ ├── __init__.py -│ ├── database.py # Direct PostgreSQL access with fuzzy search -│ └── api_client.py # Legacy HTTP API client (fallback) +│ ├── database.py # Direct PostgreSQL access (implements MusicBrainzDataProvider) +│ └── api_client.py # HTTP API client (implements MusicBrainzDataProvider) ├── cli/ # Command-line interface │ ├── __init__.py -│ └── main.py # Main CLI implementation +│ └── main.py # Main CLI implementation (uses factory pattern) ├── config/ # Configuration │ ├── __init__.py │ └── constants.py # Constants and settings ├── core/ # Core functionality +│ ├── __init__.py +│ ├── interfaces.py # Common interfaces and protocols +│ ├── factory.py # Data provider factory +│ └── song_processor.py # Centralized song processing logic ├── tests/ # Test files and scripts │ ├── __init__.py │ ├── test_*.py # Unit and integration tests │ └── debug_*.py # Debug scripts └── utils/ # Utility functions + ├── __init__.py + ├── artist_title_processing.py # Shared artist/title processing + └── data_loader.py # Data loading utilities ``` ### Architectural Principles @@ -193,13 +201,18 @@ src/ - **Error Handling**: Graceful error handling with meaningful messages - **Performance First**: Direct database access for maximum speed - **Fallback Strategy**: Automatic fallback to API when database unavailable -- **NEW**: **Database-First**: Always use live database data over static caches -- **NEW**: **Intelligent Collaboration Detection**: Distinguish band names from collaborations -- **NEW**: **Test Organization**: All test files must be placed in `src/tests/` directory, not in root +- **Interface-Based Design**: Uses dependency injection with common interfaces +- **Factory Pattern**: Clean provider creation and configuration +- **Single Responsibility**: Each class has one clear purpose +- **Database-First**: Always use live database data over static caches +- **Intelligent Collaboration Detection**: Distinguish band names from collaborations +- **Test Organization**: All test files must be placed in `src/tests/` directory, not in root ### Data Flow -1. Read JSON input file -2. For each song: +1. **CLI** uses `DataProviderFactory` to create appropriate data provider (database or API) +2. **SongProcessor** receives the data provider and processes songs using the common interface +3. **Data Provider** (database or API) implements the same interface for consistent behavior +4. For each song: - Clean artist name using name variations - Detect collaboration patterns - Use fuzzy search to find artist in database (including aliases, sort_names) @@ -207,7 +220,7 @@ src/ - For collaborations: find artist_credit and recording - For single artists: find recording by artist and title - Update song object with corrected data and MBIDs -3. Write cleaned data to output file +5. Write cleaned data to output file ### Fuzzy Search Implementation - **Algorithm**: Uses fuzzywuzzy library with multiple matching strategies @@ -316,18 +329,23 @@ docker-compose logs -f musicbrainz - [x] Fuzzy search for artists and recordings - [x] Automatic fallback to API mode - [x] Performance optimizations -- [x] **NEW**: Advanced collaboration detection and handling -- [x] **NEW**: Artist alias and sort_name search -- [x] **NEW**: Dash variation handling -- [x] **NEW**: Numerical suffix handling -- [x] **NEW**: Band name vs collaboration distinction -- [x] **NEW**: Complex collaboration parsing -- [x] **NEW**: Removed problematic known_artists cache -- [x] **NEW**: Simplified CLI with default full dataset processing -- [x] **NEW**: Separate output files for successful and failed songs (array format) -- [x] **NEW**: Human-readable text reports with statistics -- [x] **NEW**: Smart defaults for all file paths and options -- [x] **NEW**: Configurable processing limits and output file paths +- [x] Advanced collaboration detection and handling +- [x] Artist alias and sort_name search +- [x] Dash variation handling +- [x] Numerical suffix handling +- [x] Band name vs collaboration distinction +- [x] Complex collaboration parsing +- [x] Removed problematic known_artists cache +- [x] Simplified CLI with default full dataset processing +- [x] Separate output files for successful and failed songs (array format) +- [x] Human-readable text reports with statistics +- [x] Smart defaults for all file paths and options +- [x] Configurable processing limits and output file paths +- [x] **NEW**: Interface-based architecture with dependency injection +- [x] **NEW**: Factory pattern for data provider creation +- [x] **NEW**: Centralized song processing logic +- [x] **NEW**: Common interfaces for database and API clients +- [x] **NEW**: Clean separation of concerns ### 🔄 Future Enhancements - [ ] Web interface option diff --git a/README.md b/README.md index 1d90f6a..d27cb91 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # 🎵 MusicBrainz Data Cleaner v3.0 -A powerful command-line tool that cleans and normalizes your song data using the MusicBrainz database. **Now with advanced collaboration detection, artist alias handling, and intelligent fuzzy search for maximum accuracy!** +A powerful command-line tool that cleans and normalizes your song data using the MusicBrainz database. **Now with interface-based architecture, advanced collaboration detection, artist alias handling, and intelligent fuzzy search for maximum accuracy!** ## 🚀 Quick Start for New Sessions @@ -47,6 +47,8 @@ docker-compose run --rm musicbrainz-cleaner python3 [script_name].py ## ✨ What's New in v3.0 +- **🏗️ Interface-Based Architecture**: Clean dependency injection with common interfaces +- **🏭 Factory Pattern**: Smart data provider creation and configuration - **🚀 Direct Database Access**: Connect directly to PostgreSQL for 10x faster performance - **🎯 Advanced Fuzzy Search**: Intelligent matching for similar artist names and song titles - **🔄 Automatic Fallback**: Falls back to API mode if database access fails @@ -299,11 +301,19 @@ python3 src/tests/run_tests.py --list musicbrainz-cleaner/ ├── src/ │ ├── api/ # Database and API access +│ │ ├── database.py # Direct PostgreSQL access (implements MusicBrainzDataProvider) +│ │ └── api_client.py # HTTP API client (implements MusicBrainzDataProvider) │ ├── cli/ # Command-line interface +│ │ └── main.py # Main CLI implementation (uses factory pattern) │ ├── config/ # Configuration and constants │ ├── core/ # Core functionality +│ │ ├── interfaces.py # Common interfaces and protocols +│ │ ├── factory.py # Data provider factory +│ │ └── song_processor.py # Centralized song processing logic │ ├── tests/ # Test files (REQUIRED location) │ └── utils/ # Utility functions +│ ├── artist_title_processing.py # Shared artist/title processing +│ └── data_loader.py # Data loading utilities ├── data/ # Data files and output │ ├── known_artists.json # Name variations (ACDC → AC/DC) │ ├── known_recordings.json # Known recording MBIDs diff --git a/src/api/api_client.py b/src/api/api_client.py index 1f20ccb..e39298c 100644 --- a/src/api/api_client.py +++ b/src/api/api_client.py @@ -1,19 +1,20 @@ """ -Legacy HTTP API client for MusicBrainz Data Cleaner. -Used as fallback when direct database access is not available. +HTTP API client for MusicBrainz Data Cleaner. +Implements the MusicBrainzDataProvider interface for API-based access. """ import requests import time -from typing import Dict, Optional, Any +from typing import Dict, Optional, Any, Tuple from ..config.constants import ( DEFAULT_MUSICBRAINZ_URL, API_REQUEST_DELAY, REQUEST_TIMEOUT, SUCCESS_MESSAGES, ERROR_MESSAGES ) +from ..core.interfaces import MusicBrainzDataProvider -class MusicBrainzAPIClient: - """Legacy HTTP API client for MusicBrainz (fallback option).""" +class MusicBrainzAPIClient(MusicBrainzDataProvider): + """HTTP API client for MusicBrainz implementing the data provider interface.""" def __init__(self, base_url: str = DEFAULT_MUSICBRAINZ_URL): self.base_url = base_url @@ -30,8 +31,8 @@ class MusicBrainzAPIClient: print(f"API connection test failed: {e}") return False - def search_artist(self, artist_name: str) -> Optional[Dict[str, Any]]: - """Search for artist by name using API.""" + def fuzzy_search_artist(self, artist_name: str) -> Optional[Tuple[str, str, float]]: + """Search for artist by name using API. Returns (artist_name, mbid, score).""" try: url = f"{self.base_url}/ws/2/artist/?query=name:{artist_name}&fmt=json" response = self.session.get(url) @@ -39,14 +40,15 @@ class MusicBrainzAPIClient: data = response.json() if data.get('artists') and len(data['artists']) > 0: - return data['artists'][0] + artist = data['artists'][0] + return (artist['name'], artist['id'], 1.0) # Perfect match for API return None except Exception as e: print(f"API search failed for artist '{artist_name}': {e}") return None - def search_recording(self, title: str, artist_mbid: Optional[str] = None) -> Optional[Dict[str, Any]]: - """Search for recording by title and optionally artist using API.""" + def fuzzy_search_recording(self, title: str, artist_mbid: Optional[str] = None) -> Optional[Tuple[str, str, float]]: + """Search for recording by title and optionally artist using API. Returns (recording_name, mbid, score).""" try: if artist_mbid: url = f"{self.base_url}/ws/2/recording/?query=arid:{artist_mbid}%20AND%20name:{title}&fmt=json" @@ -58,7 +60,8 @@ class MusicBrainzAPIClient: data = response.json() if data.get('recordings') and len(data['recordings']) > 0: - return data['recordings'][0] + recording = data['recordings'][0] + return (recording['title'], recording['id'], 1.0) # Perfect match for API return None except Exception as e: print(f"API search failed for recording '{title}': {e}") diff --git a/src/api/database.py b/src/api/database.py index 911f0ba..d34d43c 100644 --- a/src/api/database.py +++ b/src/api/database.py @@ -13,10 +13,11 @@ from ..config.constants import ( TITLE_SIMILARITY_THRESHOLD, ARTIST_SIMILARITY_THRESHOLD, SUCCESS_MESSAGES, ERROR_MESSAGES ) -from src.utils.artist_title_processing import parse_complex_collaboration, parse_collaborators, generate_title_variations +from ..utils.artist_title_processing import parse_complex_collaboration, parse_collaborators, generate_title_variations +from ..core.interfaces import MusicBrainzDataProvider -class MusicBrainzDatabase: +class MusicBrainzDatabase(MusicBrainzDataProvider): """Direct PostgreSQL database access for MusicBrainz with fuzzy search.""" def __init__(self, host: str = DB_HOST, port: int = DB_PORT, diff --git a/src/cli/main.py b/src/cli/main.py index 13c861b..9b3e012 100644 --- a/src/cli/main.py +++ b/src/cli/main.py @@ -23,35 +23,23 @@ from ..config.constants import ( -# Import database and API clients -from ..api.database import MusicBrainzDatabase -from ..api.api_client import MusicBrainzAPIClient +# Import core components from ..core.song_processor import SongProcessor +from ..core.factory import DataProviderFactory class MusicBrainzCleaner: - """Enhanced MusicBrainz Cleaner with direct database access and fuzzy search.""" + """Enhanced MusicBrainz Cleaner with interface-based data access.""" def __init__(self, use_database: bool = True, base_url: str = DEFAULT_MUSICBRAINZ_URL): self.use_database = use_database self.base_url = base_url - # Initialize database connection (primary method) - if use_database: - self.db = MusicBrainzDatabase() - if not self.db.connect(): - print("⚠️ Database connection failed, falling back to API") - self.use_database = False + # Create data provider using factory + self.data_provider = DataProviderFactory.create_provider(use_database, base_url) - # Initialize API client (fallback method) - if not self.use_database: - self.api = MusicBrainzAPIClient(base_url) - - # Initialize centralized song processor - self.song_processor = SongProcessor( - database_client=self.db if use_database else None, - api_client=self.api if not use_database else None - ) + # Initialize centralized song processor with the data provider + self.song_processor = SongProcessor(self.data_provider) def clean_song(self, song: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]: """ @@ -394,22 +382,15 @@ def main() -> int: # Handle test connection if parsed['test_connection']: - if parsed['use_api']: - api = MusicBrainzAPIClient() - if api.test_connection(): - print("✅ Connection to MusicBrainz API server successful") - return ExitCode.SUCCESS - else: - print("❌ Connection to MusicBrainz API server failed") - return ExitCode.ERROR + provider = DataProviderFactory.create_provider(not parsed['use_api']) + if provider.test_connection(): + provider_type = "API server" if parsed['use_api'] else "database" + print(f"✅ Connection to MusicBrainz {provider_type} successful") + return ExitCode.SUCCESS else: - db = MusicBrainzDatabase() - if db.test_connection(): - print("✅ Connection to MusicBrainz database successful") - return ExitCode.SUCCESS - else: - print("❌ Connection to MusicBrainz database failed") - return ExitCode.ERROR + provider_type = "API server" if parsed['use_api'] else "database" + print(f"❌ Connection to MusicBrainz {provider_type} failed") + return ExitCode.ERROR # Process songs (main functionality) source_file = Path(parsed['source']) diff --git a/src/core/factory.py b/src/core/factory.py new file mode 100644 index 0000000..e8b8156 --- /dev/null +++ b/src/core/factory.py @@ -0,0 +1,36 @@ +""" +Factory for creating MusicBrainz data providers. +Provides a clean way to instantiate the appropriate data provider based on configuration. +""" + +from typing import Optional +from ..api.database import MusicBrainzDatabase +from ..api.api_client import MusicBrainzAPIClient +from .interfaces import MusicBrainzDataProvider +from ..config.constants import DEFAULT_MUSICBRAINZ_URL + + +class DataProviderFactory: + """Factory for creating MusicBrainz data providers.""" + + @staticmethod + def create_provider(use_database: bool = True, base_url: str = DEFAULT_MUSICBRAINZ_URL) -> MusicBrainzDataProvider: + """ + Create a data provider based on configuration. + + Args: + use_database: Whether to use database (True) or API (False) + base_url: Base URL for API client (only used if use_database=False) + + Returns: + MusicBrainzDataProvider instance + """ + if use_database: + provider = MusicBrainzDatabase() + if provider.connect(): + return provider + else: + print("⚠️ Database connection failed, falling back to API") + return MusicBrainzAPIClient(base_url) + else: + return MusicBrainzAPIClient(base_url) \ No newline at end of file diff --git a/src/core/interfaces.py b/src/core/interfaces.py new file mode 100644 index 0000000..d265f03 --- /dev/null +++ b/src/core/interfaces.py @@ -0,0 +1,60 @@ +""" +Core interfaces for MusicBrainz Data Cleaner. +Defines the common protocol that all data access implementations must follow. +""" + +from abc import ABC, abstractmethod +from typing import Dict, Optional, Any, Tuple, List + + +class MusicBrainzDataProvider(ABC): + """ + Abstract base class defining the interface for MusicBrainz data access. + Both database and API implementations must implement these methods. + """ + + @abstractmethod + def test_connection(self) -> bool: + """Test connection to the data source.""" + pass + + @abstractmethod + def fuzzy_search_artist(self, artist_name: str) -> Optional[Tuple[str, str, float]]: + """ + Fuzzy search for artist by name. + Returns (artist_name, mbid, similarity_score) or None. + """ + pass + + @abstractmethod + def fuzzy_search_recording(self, title: str, artist_mbid: Optional[str] = None) -> Optional[Tuple[str, str, float]]: + """ + Fuzzy search for recording by title and optionally artist. + Returns (recording_name, mbid, similarity_score) or None. + """ + pass + + @abstractmethod + def get_artist_info(self, mbid: str) -> Optional[Dict[str, Any]]: + """Get detailed artist information by MBID.""" + pass + + @abstractmethod + def get_recording_info(self, mbid: str) -> Optional[Dict[str, Any]]: + """Get detailed recording information by MBID.""" + pass + + +class SongProcessorInterface(ABC): + """ + Abstract base class defining the interface for song processing. + This ensures consistent behavior across different implementations. + """ + + @abstractmethod + def clean_song(self, song: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]: + """ + Clean a single song. + Returns (cleaned_song, success_status) + """ + pass \ No newline at end of file diff --git a/src/core/song_processor.py b/src/core/song_processor.py index 44f47a0..763539a 100644 --- a/src/core/song_processor.py +++ b/src/core/song_processor.py @@ -7,22 +7,20 @@ between CLI and database interactions to ensure consistency. from typing import Dict, Optional, Tuple, Any, List from ..utils.artist_title_processing import parse_complex_collaboration, generate_title_variations from ..utils.data_loader import data_loader +from .interfaces import MusicBrainzDataProvider, SongProcessorInterface -class SongProcessor: +class SongProcessor(SongProcessorInterface): """Centralized song processing with consistent logic across all interfaces.""" - def __init__(self, database_client=None, api_client=None): + def __init__(self, data_provider: MusicBrainzDataProvider): """ - Initialize with optional database and API clients. + Initialize with a data provider that implements MusicBrainzDataProvider. Args: - database_client: MusicBrainzDatabase instance for direct DB access - api_client: MusicBrainzAPIClient instance for API fallback + data_provider: Instance implementing MusicBrainzDataProvider interface """ - self.db = database_client - self.api = api_client - self.use_database = database_client is not None + self.data_provider = data_provider def find_artist_mbid(self, artist_name: str) -> Optional[str]: """ @@ -37,20 +35,9 @@ class SongProcessor: if not artist_name: return None - # Try database first if available - if self.use_database: - result = self.db.fuzzy_search_artist(artist_name) - if result and isinstance(result, tuple) and len(result) >= 2: - return result[1] # Return MBID from tuple (artist_name, mbid, score) - - # Fallback to API - if self.api: - try: - result = self.api.search_artist(artist_name) - if result: - return result['id'] - except: - pass + result = self.data_provider.fuzzy_search_artist(artist_name) + if result and isinstance(result, tuple) and len(result) >= 2: + return result[1] # Return MBID from tuple (artist_name, mbid, score) return None @@ -83,70 +70,52 @@ class SongProcessor: return recording_mbid # Handle collaborations using artist credit - if self.use_database: - # If no artist_mbid (collaboration case), try to find by title and verify artist credit - if not artist_mbid and original_artist: - # This is a collaboration case, try to find by title with all variations - for variation in title_variations: - result = self.db.fuzzy_search_recording(variation) - if result and isinstance(result, tuple) and len(result) >= 2: - recording_mbid = result[1] - - # Verify that this recording has the correct artist credit - recording_info = self.get_recording_info(recording_mbid) - if recording_info and recording_info.get('artist_credit'): - # Check if the artist credit matches our expected collaboration - expected_artist_string = original_artist.replace(',', ' & ').replace(' and ', ' & ') - if recording_info['artist_credit'].lower() == expected_artist_string.lower(): - return recording_mbid - - # If exact match fails, try partial match - if recording_info and recording_info.get('artist_credit'): - # Check if all artists in the collaboration are present in the recording - main_artist, collaborators = parse_complex_collaboration(original_artist) - recording_artists = recording_info['artist_credit'].lower() - - # Check if main artist is in the recording - if main_artist.lower() in recording_artists: - # Check if at least one collaborator is also present - for collaborator in collaborators: - if collaborator.lower() in recording_artists: - return recording_mbid - - return None - else: - # Regular case with artist_mbid - try all title variations - for variation in title_variations: - result = self.db.fuzzy_search_recording(variation, artist_mbid) - if result and isinstance(result, tuple) and len(result) >= 2: - return result[1] # Return MBID from tuple (recording_name, mbid, score) - else: - # Fallback to API - try all title variations + # If no artist_mbid (collaboration case), try to find by title and verify artist credit + if not artist_mbid and original_artist: + # This is a collaboration case, try to find by title with all variations for variation in title_variations: - try: - result = self.api.search_recording(variation, artist_mbid) - if result: - return result['id'] - except: - pass + result = self.data_provider.fuzzy_search_recording(variation) + if result and isinstance(result, tuple) and len(result) >= 2: + recording_mbid = result[1] + + # Verify that this recording has the correct artist credit + recording_info = self.get_recording_info(recording_mbid) + if recording_info and recording_info.get('artist_credit'): + # Check if the artist credit matches our expected collaboration + expected_artist_string = original_artist.replace(',', ' & ').replace(' and ', ' & ') + if recording_info['artist_credit'].lower() == expected_artist_string.lower(): + return recording_mbid + + # If exact match fails, try partial match + if recording_info and recording_info.get('artist_credit'): + # Check if all artists in the collaboration are present in the recording + main_artist, collaborators = parse_complex_collaboration(original_artist) + recording_artists = recording_info['artist_credit'].lower() + + # Check if main artist is in the recording + if main_artist.lower() in recording_artists: + # Check if at least one collaborator is also present + for collaborator in collaborators: + if collaborator.lower() in recording_artists: + return recording_mbid + + # Regular case with artist_mbid - try all title variations + for variation in title_variations: + result = self.data_provider.fuzzy_search_recording(variation, artist_mbid) + if result and isinstance(result, tuple) and len(result) >= 2: + return result[1] # Return MBID from tuple (recording_name, mbid, score) + + return None return None def get_artist_info(self, mbid: str) -> Optional[Dict[str, Any]]: """Get artist info using consistent logic.""" - if self.use_database: - return self.db.get_artist_info(mbid) - elif self.api: - return self.api.get_artist_info(mbid) - return None + return self.data_provider.get_artist_info(mbid) def get_recording_info(self, mbid: str) -> Optional[Dict[str, Any]]: """Get recording info using consistent logic.""" - if self.use_database: - return self.db.get_recording_info(mbid) - elif self.api: - return self.api.get_recording_info(mbid) - return None + return self.data_provider.get_recording_info(mbid) def _build_artist_string(self, artist_credit: list) -> str: """Build artist string from artist credit list (API format).""" @@ -189,9 +158,9 @@ class SongProcessor: recording_info = self.get_recording_info(recording_mbid) if recording_info: # Update with the correct artist credit and title - if self.use_database and recording_info.get('artist_credit'): + if recording_info.get('artist_credit'): song['artist'] = recording_info['artist_credit'] - elif not self.use_database and recording_info.get('artist-credit'): + elif recording_info.get('artist-credit'): artist_string = self._build_artist_string(recording_info['artist-credit']) if artist_string: song['artist'] = artist_string @@ -200,12 +169,11 @@ class SongProcessor: song['recording_mbid'] = recording_mbid # For collaborations, try to get the main artist's MBID - if self.use_database: - main_artist, collaborators = parse_complex_collaboration(song.get('artist', '')) - if main_artist: - artist_result = self.db.fuzzy_search_artist(main_artist) - if artist_result and isinstance(artist_result, tuple) and len(artist_result) >= 2: - song['mbid'] = artist_result[1] # Set the main artist's MBID + main_artist, collaborators = parse_complex_collaboration(song.get('artist', '')) + if main_artist: + artist_result = self.data_provider.fuzzy_search_artist(main_artist) + if artist_result and isinstance(artist_result, tuple) and len(artist_result) >= 2: + song['mbid'] = artist_result[1] # Set the main artist's MBID return song, True return song, False @@ -229,12 +197,12 @@ class SongProcessor: recording_info = self.get_recording_info(recording_mbid) if recording_info: # Update artist string if there are multiple artists, but preserve the artist MBID - if self.use_database and recording_info.get('artist_credit'): + if recording_info.get('artist_credit'): song['artist'] = recording_info['artist_credit'] # Keep the original artist MBID even when updating artist name if 'mbid' not in song: song['mbid'] = artist_mbid - elif not self.use_database and recording_info.get('artist-credit'): + elif recording_info.get('artist-credit'): artist_string = self._build_artist_string(recording_info['artist-credit']) if artist_string: song['artist'] = artist_string