Product Requirements Document (PRD)
Music Charts Archive Scraper & Analytics Platform
📋 Executive Summary
A full-stack web application that scrapes, analyzes, and visualizes music chart data from the Music Charts Archive website. The platform provides both weekly chart views and comprehensive yearly rankings, enabling users to explore historical music trends and download data for further analysis.
🎯 Product Vision
Create a comprehensive music analytics platform that transforms raw chart data into actionable insights, making historical music trends accessible and analyzable for researchers, music enthusiasts, and data analysts.
🎵 Core Features
1. Data Scraping Engine
- Weekly Chart Scraping: Extract song rankings from individual chart dates
- Yearly Data Aggregation: Collect and analyze data across all available dates for a year
- Real-time Data Fetching: Live scraping from the Music Charts Archive website
- Error Handling: Graceful handling of missing data and network issues
2. Weekly Chart View
- Date Selection: Browse available chart dates by year
- Chart Display: View top 50 songs with rankings, titles, and artists
- Data Export: Download weekly chart data in JSON format
- Responsive Design: Mobile and desktop optimized interface
3. Yearly Analytics
- Yearly Rankings: Calculate top songs of the year based on:
  - Total points (position-based scoring)
  - Highest achieved position
  - Number of appearances
- Smart Algorithm: Multi-factor ranking system
- Yearly Export: Download comprehensive yearly data
4. Data Export System
- JSON Format: Structured data export with metadata
- File Naming: Date/year-based file naming convention
- Metadata Inclusion: Chart date, total songs, formatted titles
🔧 Technical Architecture
Backend Requirements
Core Technologies (Current: Node.js/Express)
- Server Framework: Node.js/Express, Python/Flask, Java/Spring Boot, C#/.NET, Go/Gin
- Web Scraping: Cheerio (Node.js), BeautifulSoup (Python), Jsoup (Java), HtmlAgilityPack (C#), goquery (Go)
- HTTP Client: Axios (Node.js), requests (Python), OkHttp (Java), HttpClient (C#), net/http (Go)
- Data Processing: JavaScript, Python, Java, C#, Go
API Endpoints
GET /api/health
- Purpose: Health check endpoint
- Response: Status confirmation
GET /api/chart/dates/:year
- Purpose: Get available chart dates for a year
- Parameters: year (number)
- Response: Array of date objects with date and formattedDate
GET /api/chart/data/:date
- Purpose: Get chart data for a specific date
- Parameters: date (YYYY-MM-DD format)
- Response: Array of song objects with order, title, artist
GET /api/chart/yearly-top/:year
- Purpose: Get yearly top songs ranking
- Parameters: year (number)
- Response: Array of top 50 songs with order, title, artist
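The year and date parameters above are worth validating before any scraping starts. The following is a minimal sketch under stated assumptions: the helper names parseYearParam and parseDateParam are hypothetical and not part of the current implementation, and returning null for bad input is only one possible convention.

```javascript
// Hypothetical helpers (not part of the existing codebase): validate the
// :year and :date route parameters before any scraping is attempted.

// Accepts a 4-digit year string and returns it as a number, or null if invalid.
function parseYearParam(raw) {
  if (!/^\d{4}$/.test(String(raw))) return null;
  return Number(raw);
}

// Accepts a YYYY-MM-DD string and returns it unchanged if it names a real
// calendar date, or null otherwise (for example "2024-02-30").
function parseDateParam(raw) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(String(raw));
  if (!match) return null;
  const [, y, m, d] = match.map(Number);
  // Round-trip through a UTC date: impossible dates roll over and no longer match.
  const date = new Date(Date.UTC(y, m - 1, d));
  const isReal =
    date.getUTCFullYear() === y &&
    date.getUTCMonth() === m - 1 &&
    date.getUTCDate() === d;
  return isReal ? raw : null;
}
```

A route handler could respond with HTTP 400 whenever either helper returns null, which keeps malformed input away from the scraper.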
Data Models
// Date Object
{
  date: "2024-01-20",
  formattedDate: "Jan 20, 2024"
}
// Song Object
{
  order: 1,
  title: "Lovin On Me",
  artist: "Jack Harlow"
}
// Weekly Chart Export
{
  title: "January 20, 2024 - Top Songs",
  date: "2024-01-20",
  totalSongs: 50,
  songs: [ /* array of Song objects */ ]
}
// Yearly Chart Export
{
  year: 2024,
  title: "Top Songs of 2024",
  songs: [ /* array of Song objects */ ]
}
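The models above can be produced by a few small builder functions. This is a sketch, not the existing implementation: the function names are hypothetical, and formatting through toLocaleDateString with the en-US locale pinned to UTC is an assumption chosen to reproduce the sample strings shown in the models.

```javascript
// Formats an ISO date (YYYY-MM-DD) in en-US style, pinned to UTC so the
// result does not shift with the server's local time zone.
function formatDate(isoDate, monthStyle) {
  return new Date(`${isoDate}T00:00:00Z`).toLocaleDateString("en-US", {
    year: "numeric",
    month: monthStyle, // "short" gives "Jan", "long" gives "January"
    day: "numeric",
    timeZone: "UTC",
  });
}

// Builds the Date Object returned by the dates endpoint.
function toDateObject(isoDate) {
  return { date: isoDate, formattedDate: formatDate(isoDate, "short") };
}

// Builds the Weekly Chart Export; totalSongs is derived from the array.
function buildWeeklyExport(isoDate, songs) {
  return {
    title: `${formatDate(isoDate, "long")} - Top Songs`,
    date: isoDate,
    totalSongs: songs.length,
    songs,
  };
}

// Builds the Yearly Chart Export.
function buildYearlyExport(year, songs) {
  return { year, title: `Top Songs of ${year}`, songs };
}
```

Deriving totalSongs from songs.length, rather than storing it separately, keeps the metadata from drifting out of sync with the payload.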
Scraping Logic
- Date Extraction: Parse HTML for chart date links
- Song Data Extraction: Parse table rows for song information
- Data Validation: Ensure data integrity and completeness
- Error Recovery: Skip problematic dates and continue processing
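The validation and error-recovery steps above can be sketched independently of any HTML parser. In this sketch fetchChart is an injected, hypothetical async function that returns the Song objects for one date; collectCharts and isValidSong are illustrative names, and the validation rules are assumptions rather than the rules used by the current backend.

```javascript
// A song is kept only if it has a positive integer rank and non-empty text.
function isValidSong(song) {
  return (
    song != null &&
    Number.isInteger(song.order) &&
    song.order >= 1 &&
    typeof song.title === "string" && song.title.trim() !== "" &&
    typeof song.artist === "string" && song.artist.trim() !== ""
  );
}

// fetchChart is an injected (hypothetical) async function: date -> Song[].
// Dates that throw, or that yield no valid rows, are recorded and skipped
// so one bad week does not abort the whole year.
async function collectCharts(dates, fetchChart) {
  const charts = [];
  const skipped = [];
  for (const date of dates) {
    try {
      const songs = (await fetchChart(date)).filter(isValidSong);
      if (songs.length === 0) throw new Error("no valid rows");
      charts.push({ date, songs });
    } catch (err) {
      skipped.push({ date, reason: err.message });
    }
  }
  return { charts, skipped };
}
```

Processing dates sequentially is deliberate here: it keeps the request rate low, and the skipped list gives the logging layer something concrete to report.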
Yearly Ranking Algorithm
// Scoring System
- Position Points: 1st = 50 points, 50th = 1 point (51 minus the position, assuming one point per step)
- Highest Position: Track best position achieved
- Appearance Count: Number of weeks on charts
// Sorting Priority
1. Total Points (descending)
2. Highest Position (ascending)
3. Appearance Count (descending)
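The scoring and sorting rules above can be sketched as a single function. Two assumptions are made that the PRD does not state: points fall linearly (51 minus the position), and a song is identified by its exact title and artist pair. The name rankYear and the input shape are illustrative.

```javascript
const CHART_SIZE = 50;

// charts: [{ date, songs: [{ order, title, artist }] }]
// Returns the top `limit` songs as Song objects with a fresh yearly order.
function rankYear(charts, limit = CHART_SIZE) {
  const totals = new Map();
  for (const { songs } of charts) {
    for (const { order, title, artist } of songs) {
      // Assumption: exact title + artist identifies a song across weeks.
      const key = `${title}\u0000${artist}`;
      const entry =
        totals.get(key) ?? { title, artist, points: 0, best: Infinity, weeks: 0 };
      // 1st = 50 points ... 50th = 1 point; positions past 50 score nothing.
      entry.points += Math.max(0, CHART_SIZE + 1 - order);
      entry.best = Math.min(entry.best, order);
      entry.weeks += 1;
      totals.set(key, entry);
    }
  }
  return [...totals.values()]
    // Total points descending, then best position ascending, then weeks descending.
    .sort((a, b) => b.points - a.points || a.best - b.best || b.weeks - a.weeks)
    .slice(0, limit)
    .map(({ title, artist }, i) => ({ order: i + 1, title, artist }));
}
```

For example, a song charting at positions 1 and 3 and another charting at 2 and 2 both total 98 points; the first ranks higher because its best position is better.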
Frontend Requirements
Core Technologies (Current: React)
- UI Framework: React, Vue.js, Angular, Svelte, Flutter (mobile), SwiftUI (iOS), Jetpack Compose (Android)
- State Management: React Hooks, Vuex, Redux, NgRx, Svelte stores
- HTTP Client: Axios, fetch API, HttpClient
- Styling: CSS3, SCSS, Tailwind CSS, Material-UI, Bootstrap
Component Architecture
App
├── Layout
├── YearSelector
├── ViewModeSelector (Weekly/Yearly)
├── DateList (Weekly mode)
├── ChartTable
├── YearlyTopSongs
├── DownloadButton
└── YearlyDownloadButton
State Management
// Core State
{
  selectedYear: number,
  selectedDate: string | null,
  viewMode: 'weekly' | 'yearly',
  dates: DateObject[],
  chartData: SongObject[],
  yearlySongs: SongObject[]
}
// Loading States
{
  datesLoading: boolean,
  dataLoading: boolean,
  yearlyLoading: boolean
}
// Error States
{
  datesError: string | null,
  dataError: string | null,
  yearlyError: string | null
}
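One way to keep the core, loading, and error fields above consistent is a single reducer. This is a sketch only: the current frontend uses custom hooks, and the action names and the createInitialState helper below are hypothetical. Only the dates slice is shown; the chart data and yearly slices would follow the same pattern.

```javascript
// Builds the full state shape with nothing loaded yet.
function createInitialState(year) {
  return {
    selectedYear: year,
    selectedDate: null,
    viewMode: "weekly",
    dates: [],
    chartData: [],
    yearlySongs: [],
    datesLoading: false,
    dataLoading: false,
    yearlyLoading: false,
    datesError: null,
    dataError: null,
    yearlyError: null,
  };
}

// Action types are hypothetical names chosen for this sketch.
function chartReducer(state, action) {
  switch (action.type) {
    case "year/selected":
      // A new year invalidates the date list and everything derived from it,
      // but the user's weekly/yearly view choice is preserved.
      return { ...createInitialState(action.year), viewMode: state.viewMode };
    case "dates/requested":
      return { ...state, datesLoading: true, datesError: null };
    case "dates/loaded":
      return { ...state, datesLoading: false, dates: action.dates };
    case "dates/failed":
      return { ...state, datesLoading: false, datesError: action.message };
    default:
      return state;
  }
}
```

A reducer of this shape is a pure function, so it can be unit tested without rendering any component and can be passed to React's useReducer hook unchanged.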
User Interface Requirements
- Responsive Design: Mobile-first approach
- Loading States: Visual feedback during data fetching
- Error Handling: User-friendly error messages
- Accessibility: WCAG 2.1 compliance
- Cross-browser Compatibility: Modern browser support
📱 Mobile Platform Considerations
iOS (Swift/SwiftUI)
// Data Models
struct ChartDate: Codable {
    let date: String
    let formattedDate: String
}
struct Song: Codable {
    let order: Int
    let title: String
    let artist: String
}
// Network Layer (declared as a protocol so the body-less signatures are valid Swift)
protocol ChartAPIService {
    func fetchChartDates(year: Int) async throws -> [ChartDate]
    func fetchChartData(date: String) async throws -> [Song]
    func fetchYearlyTopSongs(year: Int) async throws -> [Song]
}
// UI Components (declarations only; bodies omitted)
struct YearSelector: View
struct ChartTableView: View
struct YearlyTopSongsView: View
struct DownloadButton: View
Android (Kotlin/Jetpack Compose)
// Data Models
data class ChartDate(
    val date: String,
    val formattedDate: String
)
data class Song(
    val order: Int,
    val title: String,
    val artist: String
)
// Network Layer (declared as an interface so the body-less signatures are valid Kotlin)
interface ChartAPIService {
    suspend fun fetchChartDates(year: Int): List<ChartDate>
    suspend fun fetchChartData(date: String): List<Song>
    suspend fun fetchYearlyTopSongs(year: Int): List<Song>
}
// UI Components (declarations only; bodies omitted)
@Composable fun YearSelector()
@Composable fun ChartTable()
@Composable fun YearlyTopSongs()
@Composable fun DownloadButton()
🎵 Data Flow
Weekly Chart Flow
- User selects year → Fetch available dates
- User selects date → Fetch chart data
- Display chart table → Enable download
- User downloads → Generate JSON file
Yearly Chart Flow
- User selects year → Fetch all dates for year
- System scrapes all dates → Calculate rankings
- Display yearly top songs → Enable download
- User downloads → Generate yearly JSON file
🛠 Development Guidelines
Backend Development
- Error Handling: Implement comprehensive error handling
- Rate Limiting: Throttle outbound scraping requests in line with the website's terms of service
- Caching: Implement caching for frequently accessed data
- Logging: Comprehensive logging for debugging
- Testing: Unit tests for core algorithms
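For the caching guideline above, a small in-memory cache with a time-to-live is one possible starting point. This is a sketch under assumptions: the class name TtlCache is hypothetical, the PRD does not specify a caching strategy, and a production setup might prefer an external store.

```javascript
// Minimal in-memory cache whose entries expire after ttlMs milliseconds.
class TtlCache {
  // `now` is injectable so tests can control time; it defaults to Date.now.
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key) {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() >= hit.expiresAt) {
      this.entries.delete(key); // evict lazily on read
      return undefined;
    }
    return hit.value;
  }

  // Stores a value and stamps it with its expiry time.
  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Keying the cache by chart date or year would let repeated requests for the same chart skip the scrape entirely. Historical charts rarely change, so a long TTL is probably reasonable for past dates.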
Frontend Development
- Component Reusability: Create reusable UI components
- State Management: Centralized state management
- Performance: Optimize for large datasets
- User Experience: Smooth loading and error states
- Accessibility: Screen reader and keyboard navigation support
Mobile Development
- Offline Support: Cache data for offline viewing
- Native Features: Share functionality, file downloads
- Performance: Optimize for mobile devices
- Platform Guidelines: Follow iOS/Android design guidelines
📊 Success Metrics
Technical Metrics
- API Response Time: < 2 seconds for chart data
- Scraping Success Rate: > 95% successful data extraction
- Error Rate: < 5% failed requests
- Uptime: > 99% availability
User Experience Metrics
- Page Load Time: < 3 seconds
- User Engagement: Time spent on platform
- Download Rate: Percentage of users downloading data
- Error Recovery: Successful error resolution rate
🔒 Security & Compliance
Data Protection
- Rate Limiting: Prevent abuse of scraping endpoints
- User Agent: Proper identification in requests
- Error Handling: No sensitive data in error messages
- CORS: Proper cross-origin resource sharing
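To illustrate the rate-limiting item above, the sketch below uses a fixed-window counter per client. The PRD does not prescribe an algorithm, so this choice, the class name FixedWindowLimiter, and the idea of keying by a client identifier such as an IP address are all assumptions.

```javascript
// Allows at most maxRequests per client within each windowMs-long window.
class FixedWindowLimiter {
  // `now` is injectable so tests can control time; it defaults to Date.now.
  constructor(maxRequests, windowMs, now = Date.now) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;
    this.windows = new Map(); // clientId -> { start, count }
  }

  // Returns true if the request identified by clientId may proceed.
  allow(clientId) {
    const t = this.now();
    const current = this.windows.get(clientId);
    // Start a fresh window on first contact or once the old one has elapsed.
    if (!current || t - current.start >= this.windowMs) {
      this.windows.set(clientId, { start: t, count: 1 });
      return true;
    }
    if (current.count < this.maxRequests) {
      current.count += 1;
      return true;
    }
    return false;
  }
}
```

An Express middleware could call allow with the request's client address and respond with HTTP 429 when it returns false. A fixed window permits short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of more bookkeeping.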
Legal Considerations
- Terms of Service: Respect Music Charts Archive terms
- Data Usage: Educational and research purposes
- Attribution: Proper credit to data source
- Rate Limiting: Responsible scraping practices
🚀 Future Enhancements
Phase 2 Features
- Historical Trends: Visual charts showing song performance over time
- Artist Analytics: Artist-specific statistics and rankings
- Genre Analysis: Genre-based filtering and analysis
- Advanced Filtering: Date range, artist, song title filters
Phase 3 Features
- User Accounts: Save favorite charts and analyses
- Social Features: Share charts and insights
- API Access: Public API for developers
- Data Visualization: Interactive charts and graphs
📝 Implementation Checklist
Backend Setup
- Choose server framework and language
- Set up web scraping library
- Implement API endpoints
- Create data models
- Implement yearly ranking algorithm
- Add error handling and logging
- Set up testing framework
Frontend Setup
- Choose UI framework
- Set up state management
- Create reusable components
- Implement API integration
- Add download functionality
- Implement responsive design
- Add loading and error states
Mobile Setup (if applicable)
- Set up mobile development environment
- Create data models
- Implement network layer
- Create UI components
- Add native features (download, share)
- Test on devices
🎯 Current Implementation Details
Tech Stack
- Backend: Node.js, Express.js, Cheerio, Axios
- Frontend: React 18, Custom Hooks, CSS3
- Development: npm, nodemon
Project Structure
m/
├── frontend/               # React frontend application
│   ├── src/
│   │   ├── components/     # React components
│   │   ├── hooks/          # Custom React hooks
│   │   ├── services/       # API service layer
│   │   ├── App.jsx         # Main app component
│   │   ├── index.js        # Entry point
│   │   └── styles.css      # Global styles
│   ├── public/
│   └── package.json
└── backend/                # Node.js/Express backend API
    ├── src/
    │   ├── routes/         # API routes
    │   ├── controllers/    # Route controllers
    │   ├── models/         # Data models & scraping logic
    │   ├── middleware/     # Express middleware
    │   └── server.js       # Main server file
    └── package.json
Key Files
- start.sh: Startup script for both services
- README.md: Setup and usage instructions
- PRD.md: This product requirements document
This PRD provides a comprehensive blueprint for recreating the Music Charts Archive Scraper in any technology stack, from web frameworks to mobile platforms. The modular architecture and clear data flow make it adaptable to different implementation approaches while maintaining the core functionality and user experience.