This application is designed to streamline the process of annotating audio files from the Free Music Archive (FMA) Small dataset. It provides a user-friendly interface for listening to audio tracks, selecting segments, and adding detailed annotations including descriptions, instruments, moods, tempo, and genres.
The primary purpose of this tool is to create a high-quality labeled dataset for training music generation AI models like MusicLM. By annotating audio segments with rich metadata, we can:
- Train models to understand the relationship between textual descriptions and audio characteristics
- Generate more accurate and contextually appropriate music based on text prompts
- Create a structured dataset that captures musical attributes in a consistent format
Key features:

- Waveform Visualization: Interactive audio waveform display with region selection
- Audio Playback Controls: Play, pause, skip forward/backward, and loop selected regions
- Automatic BPM Detection: Uses audio analysis to detect and suggest tempo values
- Structured Annotation: Capture multiple dimensions of musical information (see the record sketch after this list):
  - Textual descriptions
  - Instruments present
  - Emotional moods
  - Tempo (BPM)
  - Musical genres
- Progress Tracking: Monitor annotation progress across the entire dataset
- Audio Processing: Automatically processes selected segments for training (trimming, converting to mono, standardizing sample rate)
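As a rough illustration of the structured annotation above, those dimensions could map to a record like the one below. This is a minimal sketch; the interface and field names are hypothetical, not the app's actual schema.

```typescript
// Hypothetical shape of one saved annotation; all names here are
// illustrative and may differ from the app's actual Firebase schema.
interface SegmentAnnotation {
  trackId: string;       // FMA track identifier
  startSec: number;      // segment start within the source file
  endSec: number;        // segment end (default span is 10 seconds)
  description: string;   // free-text description of the segment
  instruments: string[]; // e.g. ["piano", "drums"]
  moods: string[];       // e.g. ["melancholic", "calm"]
  bpm: number;           // detected tempo, rounded to the nearest 10
  genres: string[];      // e.g. ["electronic", "ambient"]
}
```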
The application is built with:
- Next.js: React framework for the frontend and API routes
- Firebase: For storing annotation data and tracking progress
- AWS S3: For storing and retrieving audio files
- WaveSurfer.js: For audio visualization and interaction (see the sketch after this list)
- Web Audio API: For audio processing and BPM detection
- FFmpeg: For server-side audio processing
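For context, waveform rendering with region selection in WaveSurfer.js typically looks something like the sketch below. It assumes WaveSurfer.js v7 with its Regions plugin; the container selector and audio URL are placeholders.

```typescript
import WaveSurfer from 'wavesurfer.js';
import RegionsPlugin from 'wavesurfer.js/dist/plugins/regions.esm.js';

// Render the waveform into a container element (placeholder values).
const ws = WaveSurfer.create({
  container: '#waveform',
  url: '/api/audio/track.mp3',
});

// Enable draggable, resizable region selection on the waveform.
const regions = ws.registerPlugin(RegionsPlugin.create());

ws.on('decode', () => {
  // Pre-select a default 10-second segment once the audio is decoded.
  regions.addRegion({ start: 0, end: 10, drag: true, resize: true });
});
```

Looped playback of the selection can then be built on the plugin's region events and each region's `play()` method.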
The annotation workflow proceeds as follows:

1. The system automatically selects the next unannotated track from the FMA Small dataset
2. The user listens to the track and selects a meaningful segment (default is 10 seconds)
3. The BPM is automatically detected and rounded to the nearest multiple of 10 for consistency
4. The user adds descriptive text and selects the relevant instruments, moods, and genres
5. Upon saving, the selected audio segment is processed: trimmed, converted to mono, and resampled (see the sketch after this list)
6. The processed audio and annotations are stored for later use in AI training
7. The system advances to the next track automatically
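To make steps 3 and 5 concrete: the rounding is a one-liner, and the server-side processing can be expressed as a single ffmpeg invocation, roughly as below. This is a sketch, not the app's exact implementation; the helper names and the 44.1 kHz target rate are assumptions.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const run = promisify(execFile);

// Step 3: round a detected tempo to the nearest multiple of 10.
const roundBpm = (bpm: number): number => Math.round(bpm / 10) * 10;

// Step 5: trim the selected segment, downmix to mono, and resample.
// The 44.1 kHz default is an assumption; use whatever rate the
// training pipeline expects.
async function processSegment(
  input: string,
  output: string,
  startSec: number,
  durationSec: number,
  sampleRate = 44100,
): Promise<void> {
  await run('ffmpeg', [
    '-ss', String(startSec),   // seek to the segment start
    '-t', String(durationSec), // keep only the selected duration
    '-i', input,
    '-ac', '1',                // downmix to mono
    '-ar', String(sampleRate), // standardize the sample rate
    '-y',                      // overwrite any existing output file
    output,
  ]);
}
```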
While designed specifically for the FMA Small dataset, this tool can be easily adapted for:
- Other music datasets by modifying the track selection logic
- Sound effect libraries by adjusting the annotation categories
- Voice or speech datasets by changing the metadata fields
- Any audio annotation task requiring segment selection and structured labeling
To run the application locally:

1. Clone the repository
2. Install dependencies with `npm install`
3. Configure environment variables for Firebase and AWS S3
4. Run the development server with `npm run dev`
5. Access the application at `http://localhost:3000`
The following environment variables need to be set:
```
S3_BUCKET_NAME=your-bucket-name
S3_REGION=your-region
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
FIREBASE_PROJECT_ID=your-project-id
FIREBASE_PRIVATE_KEY=your-private-key
FIREBASE_CLIENT_EMAIL=your-client-email
```
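As one possible wiring, the Firebase variables might be consumed server-side as sketched below, assuming the modular `firebase-admin` SDK. Note that private keys pasted into `.env` files often carry literal `\n` sequences that must be restored.

```typescript
import { cert, initializeApp } from 'firebase-admin/app';
import { getFirestore } from 'firebase-admin/firestore';

// Build a service-account credential from the environment.
const app = initializeApp({
  credential: cert({
    projectId: process.env.FIREBASE_PROJECT_ID,
    clientEmail: process.env.FIREBASE_CLIENT_EMAIL,
    // Restore real newlines if the key was stored with literal "\n".
    privateKey: process.env.FIREBASE_PRIVATE_KEY?.replace(/\\n/g, '\n'),
  }),
});

export const db = getFirestore(app);
```

The AWS SDK picks up `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` from the environment automatically, so the S3 client usually only needs the region and bucket name passed explicitly.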
Planned future enhancements:

- Support for batch annotation to increase efficiency
- Integration with AI-assisted annotation suggestions
- Enhanced audio analysis for more detailed feature extraction
- Export functionality for different AI training formats
- User management for collaborative annotation projects
This project is licensed under the MIT License - see the LICENSE file for details.
We welcome contributions! Please fork the repository and create a pull request with your changes.