MusicBrainz Data Cleaner - Tests
This directory contains all tests for the MusicBrainz Data Cleaner project, organized by type.
📁 Test Structure
src/tests/
├── unit/ # Unit tests for individual components
├── integration/ # Integration tests for database and API
├── debug/ # Debug scripts and troubleshooting tests
├── run_tests.py # Test runner script
├── README.md # This file
├── legacy/ # Legacy scripts moved from root directory
└── moved/ # Test files moved from root directory
Legacy Scripts (Moved from Root)
- process_full_dataset.py - Legacy script that redirects to new CLI
- musicbrainz_cleaner.py - Legacy entry point script
Moved Test Files (Moved from Root)
- test_title_cleaning.py - Test title cleaning functionality
- test_simple_query.py - Test simple database queries
- debug_artist_search.py - Debug artist search functionality
- test_failed_collaborations.py - Test failed collaboration cases
- test_collaboration_debug.py - Debug collaboration parsing
- test_100_random.py - Test 100 random songs
- quick_test_20.py - Quick test with 20 songs
🧪 Test Categories
Unit Tests (unit/)
- Purpose: Test individual components in isolation
- Examples:
  - test_data_loader.py - Test data loading functionality
  - test_collaboration_patterns.py - Test collaboration detection
  - test_hyphenated_artists.py - Test artist name variations
  - test_eazy_e.py - Test specific edge cases
Integration Tests (integration/)
- Purpose: Test interactions between components
- Examples:
  - test_cli.py - Test command-line interface
  - direct_db_test.py - Test database connectivity
  - test_db_connection.py - Test database queries
Debug Tests (debug/)
- Purpose: Debug scripts and troubleshooting tools
- Examples:
  - debug_collaboration.py - Debug collaboration parsing
  - simple_debug.py - Simple debugging utilities
  - check_collaboration.py - Check collaboration handling
🚀 Running Tests
Run All Tests
python3 src/tests/run_tests.py
Running Moved Test Files
The following test files were moved from the root directory to src/tests/:
# Run individual moved test files
python3 src/tests/test_100_random.py
python3 src/tests/quick_test_20.py
python3 src/tests/test_title_cleaning.py
python3 src/tests/test_simple_query.py
python3 src/tests/debug_artist_search.py
python3 src/tests/test_failed_collaborations.py
python3 src/tests/test_collaboration_debug.py
Running Legacy Scripts
Legacy scripts that redirect to the new CLI:
# Legacy full dataset processing (redirects to CLI)
python3 src/tests/process_full_dataset.py
# Legacy entry point (redirects to CLI)
python3 src/tests/musicbrainz_cleaner.py
Note: These legacy scripts are kept for backward compatibility but the new CLI is preferred:
# Preferred method (new CLI)
docker-compose run --rm musicbrainz-cleaner python3 -m src.cli.main
Run Specific Test Categories
# Run only unit tests
python3 src/tests/run_tests.py --unit
# Run only integration tests
python3 src/tests/run_tests.py --integration
Run Specific Test Module
# Run a specific test file
python3 src/tests/run_tests.py test_data_loader
python3 src/tests/run_tests.py test_collaboration_patterns
python3 src/tests/run_tests.py test_cli
List Available Tests
python3 src/tests/run_tests.py --list
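For readers curious how flags like these can map onto test discovery, here is a minimal sketch built on the standard library's argparse and unittest. It is not the actual run_tests.py: the function names are hypothetical, and it assumes that a run without flags covers the unit/ and integration/ directories.

```python
import argparse
import unittest
from pathlib import Path


def build_parser():
    """Parser mirroring the flags documented above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Illustrative test runner")
    parser.add_argument("module", nargs="?", help="test module name, e.g. test_cli")
    parser.add_argument("--unit", action="store_true", help="only unit tests")
    parser.add_argument("--integration", action="store_true", help="only integration tests")
    parser.add_argument("--list", action="store_true", help="list tests instead of running")
    return parser


def select_dirs(args, root):
    """Choose which test directories to search (assumed layout: unit/, integration/)."""
    root = Path(root)
    if args.unit:
        return [root / "unit"]
    if args.integration:
        return [root / "integration"]
    return [root / "unit", root / "integration"]


def select_pattern(args):
    """A named module narrows discovery to that one file."""
    return f"{args.module}.py" if args.module else "test_*.py"


def collect(args, root):
    """Build one suite from every selected directory that exists."""
    suite = unittest.TestSuite()
    for directory in select_dirs(args, root):
        if directory.is_dir():
            loader = unittest.TestLoader()
            suite.addTests(loader.discover(str(directory), pattern=select_pattern(args)))
    return suite
```

A caller would then pass the result of collect(build_parser().parse_args(), "src/tests") to unittest.TextTestRunner().run(...), or print the collected test IDs when --list is set.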
📋 Test Data Files
Some tests use JSON data files for testing:
- unit/test_aliases.json - Test data for artist aliases
- unit/test_sclub7.json - Test data for name variations
- unit/test_aliases_cleaned.json - Expected output for alias tests
- unit/test_sclub7_cleaned.json - Expected output for name variation tests
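The input and expected-output pairs above suggest a simple comparison pattern, sketched below. The helper names are hypothetical, the clean argument stands in for whatever cleaning function the project exposes, and the sketch does not assume any particular structure inside the JSON files.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON fixture from disk."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def assert_matches_expected(clean, input_path, expected_path):
    """Run `clean` over the input fixture and compare with the expected fixture.

    `clean` is a placeholder for the real cleaning function.
    """
    actual = clean(load_json(input_path))
    expected = load_json(expected_path)
    assert actual == expected, f"{input_path}: output differs from {expected_path}"
```

A test could call this with src/tests/unit/test_aliases.json as the input path and src/tests/unit/test_aliases_cleaned.json as the expected path.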
🔧 Test Requirements
- Database: Some tests require a running MusicBrainz database
- Dependencies: All Python dependencies must be installed
- Environment: Tests should be run from the project root directory
📝 Writing New Tests
Unit Tests
- Place in unit/ directory
- Test individual functions or classes
- Use mock data when possible
- Follow naming convention: test_*.py
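A unit test following these guidelines might look like the sketch below. The function normalize_artist and the find_artist method on its lookup argument are hypothetical stand-ins, not part of the project's API; the point is only to show unittest.mock replacing a database-backed dependency.

```python
import unittest
from unittest.mock import Mock


def normalize_artist(name, lookup):
    """Hypothetical function under test: return the canonical artist name.

    `lookup` stands in for whatever database client the real code uses.
    """
    stripped = name.strip()
    canonical = lookup.find_artist(stripped)
    return canonical if canonical is not None else stripped


class TestNormalizeArtist(unittest.TestCase):
    def test_uses_canonical_name_from_lookup(self):
        # The mock returns canned data, so no database is needed.
        lookup = Mock()
        lookup.find_artist.return_value = "Eazy-E"
        self.assertEqual(normalize_artist(" Eazy E ", lookup), "Eazy-E")
        lookup.find_artist.assert_called_once_with("Eazy E")

    def test_falls_back_to_input_when_not_found(self):
        lookup = Mock()
        lookup.find_artist.return_value = None
        self.assertEqual(normalize_artist("Unknown Artist", lookup), "Unknown Artist")
```

Saved as unit/test_<something>.py, a file like this matches the test_*.py naming convention.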
Integration Tests
- Place in integration/ directory
- Test component interactions
- May require database connection
- Follow naming convention: test_*.py
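Because integration tests may need a database, one option is to skip them when it is unreachable instead of letting them fail. The sketch below uses only the standard library. The environment variable names DB_HOST and DB_PORT and the default port 5432 are assumptions for illustration, not settings taken from this project.

```python
import os
import socket
import unittest

# Assumed settings: variable names and defaults are illustrative only.
DB_HOST = os.environ.get("DB_HOST", "localhost")
DB_PORT = int(os.environ.get("DB_PORT", "5432"))


def db_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@unittest.skipUnless(db_reachable(DB_HOST, DB_PORT), "database not reachable")
class TestDatabaseQueries(unittest.TestCase):
    def test_connection_is_open(self):
        # Real assertions against the database would go here.
        self.assertTrue(db_reachable(DB_HOST, DB_PORT))
```

The same db_reachable helper doubles as a quick first check when tests fail for connection reasons.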
Debug Scripts
- Place in debug/ directory
- Use for troubleshooting specific issues
- Can be temporary or permanent
- Follow naming convention: debug_*.py or check_*.py
🐛 Debugging Tests
If tests fail:
- Check database connection: Ensure MusicBrainz database is running
- Check dependencies: Ensure all requirements are installed
- Check environment: Ensure you're running from the correct directory
- Use debug scripts: Run debug scripts in debug/ directory for troubleshooting
📊 Test Coverage
The test suite covers:
- ✅ Data loading and validation
- ✅ Artist name normalization
- ✅ Collaboration detection
- ✅ Database connectivity
- ✅ CLI functionality
- ✅ Edge cases and error handling
- ✅ Fuzzy search algorithms
- ✅ Recording count prioritization
🔄 Continuous Integration
Tests are automatically run:
- On pull requests
- Before releases
- During development
All tests must pass before code is merged.