---
title: Audio Difficulty Estimator
emoji: 🎹
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: "4.26.0"
python_version: 3.10.13
app_file: app.py
pinned: false
tags:
- music
- audio
- piano
- difficulty-estimation
short_description: Estimate piano difficulty from audio
hardware: "a100-large"
---
# 🎼 Music Difficulty Estimator
This Gradio app estimates the **difficulty of piano pieces** from uploaded audio (MP3), uploaded video (MP4), or YouTube links. It uses pretrained models to generate a MIDI transcription and to predict difficulty from three musical perspectives:
- CQT-based representation
- Piano roll representation
- Multimodal embeddings
## How it works
1. You upload an audio or video file, or paste a YouTube link.
2. The audio is transcribed to MIDI using a piano transcription model.
3. Three different difficulty models analyze the audio and generate predictions.
4. You can listen to the extracted MP3 and the generated MIDI.
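
The steps above can be sketched as a small orchestration function. This is only an outline of the flow under stated assumptions: `run_pipeline`, `DifficultyReport`, and the three stage callables (`to_mp3`, `transcribe`, and the entries of `models`) are hypothetical names, not the functions used in `app.py`.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DifficultyReport:
    """Hypothetical container for what the interface shows to the user."""
    mp3_path: str
    midi_path: str
    predictions: Dict[str, float]


def run_pipeline(
    source: str,
    to_mp3: Callable[[str], str],
    transcribe: Callable[[str], str],
    models: Dict[str, Callable[[str, str], float]],
) -> DifficultyReport:
    # Step 1: normalize the upload or link into an MP3 file.
    mp3_path = to_mp3(source)
    # Step 2: transcribe the audio to MIDI.
    midi_path = transcribe(mp3_path)
    # Step 3: let each difficulty model produce its own prediction.
    predictions = {
        name: model(mp3_path, midi_path) for name, model in models.items()
    }
    # Step 4: return both files so they can be played back.
    return DifficultyReport(mp3_path, midi_path, predictions)
```

Passing the stages in as callables keeps the sketch independent of any particular transcription or difficulty model.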
## 📦 Model loading
All models are stored separately in the [pramoneda/audio](https://huggingface.co/pramoneda/audio) model repository and are downloaded dynamically via `huggingface_hub`.
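
As a rough illustration of dynamic loading, the sketch below wraps `hf_hub_download` from `huggingface_hub`. The repository ID comes from this README; the helper name `fetch_checkpoint`, its `downloader` parameter, and any filename you pass are assumptions, since the actual checkpoint names in the repository are not listed here.

```python
from pathlib import Path

REPO_ID = "pramoneda/audio"  # model repository named in this README


def fetch_checkpoint(filename, repo_id=REPO_ID, downloader=None):
    """Download one file from the model repo and return its local path.

    `filename` is whatever checkpoint name the repo actually contains;
    no real filenames are assumed here. `downloader` is a hypothetical
    hook that makes the function easy to test offline.
    """
    if downloader is None:
        # Imported lazily so the module loads without huggingface_hub.
        from huggingface_hub import hf_hub_download
        downloader = hf_hub_download
    # hf_hub_download caches files locally and returns the cached path.
    return Path(downloader(repo_id=repo_id, filename=filename))
```

Because the Hub client caches downloads, repeated calls for the same file should not fetch it again.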
## Input formats
- MP3 audio
- MP4 video (audio extracted automatically)
- YouTube links
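
One way to route these three input types is a small classifier that runs before any processing. This is a sketch, not the app's actual logic: the function name `classify_input` and the set of accepted YouTube hostnames are assumptions.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hostnames treated as YouTube links (an illustrative, non-exhaustive set).
YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "m.youtube.com",
    "music.youtube.com",
    "youtu.be",
}


def classify_input(source: str) -> str:
    """Return 'youtube', 'mp3', or 'mp4'; raise ValueError otherwise."""
    source = source.strip()
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            return "youtube"
        raise ValueError(f"Unsupported URL host: {host!r}")

    # Anything else is treated as a local file path.
    suffix = Path(source).suffix.lower()
    if suffix == ".mp3":
        return "mp3"
    if suffix == ".mp4":
        return "mp4"
    raise ValueError(f"Unsupported file type: {suffix!r}")
```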
## ✨ Built with
- `gradio` for the interface
- `pydub` and `yt_dlp` for audio processing
- `huggingface_hub` to load model checkpoints
- `ffmpeg-python` for format conversion
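
For readers curious how `yt_dlp` and `pydub` can produce an MP3 from a link or an MP4 upload, here is a minimal sketch. The helper names, the output template, and the chosen options are assumptions rather than the settings in `app.py`; both libraries rely on an `ffmpeg` binary being available.

```python
from pathlib import Path


def build_ydl_opts(out_dir):
    """Options asking yt_dlp for the best audio stream, converted to MP3."""
    return {
        "format": "bestaudio/best",
        "noplaylist": True,
        "quiet": True,
        # Name the file after the video ID (illustrative choice).
        "outtmpl": str(Path(out_dir) / "%(id)s.%(ext)s"),
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}
        ],
    }


def download_youtube_audio(url, out_dir):
    """Download a YouTube link and return the path of the resulting MP3."""
    import yt_dlp  # lazy import: only needed for YouTube inputs

    with yt_dlp.YoutubeDL(build_ydl_opts(out_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
    return Path(out_dir) / f"{info['id']}.mp3"


def mp4_to_mp3(mp4_path, mp3_path):
    """Extract the audio track of an MP4 file and save it as MP3."""
    from pydub import AudioSegment  # lazy import: needs ffmpeg at runtime

    audio = AudioSegment.from_file(mp4_path, format="mp4")
    audio.export(mp3_path, format="mp3")
    return Path(mp3_path)
```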
## Related
- [Model repo: pramoneda/audio](https://huggingface.co/pramoneda/audio)
- [More projects by pramoneda](https://huggingface.co/pramoneda)
---