
metadata
title: Audio Difficulty Estimator
emoji: 🎹
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 4.26.0
python_version: 3.10.13
app_file: app.py
pinned: false
tags:
  - music
  - audio
  - piano
  - difficulty-estimation
short_description: Estimate piano difficulty from audio
hardware: a100-large

🎼 Music Difficulty Estimator

This Gradio app estimates the difficulty of a piano piece from an uploaded MP3 audio file, an MP4 video file, or a YouTube link. It uses pretrained models to transcribe the audio to MIDI and to predict difficulty from three input representations:

  • CQT-based representation
  • Piano roll representation
  • Multimodal embeddings
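
As an illustration of the piano roll representation, the sketch below turns a list of transcribed notes into a binary key-by-frame matrix. It is not the app's own code: the note tuple layout, the 10 frames-per-second resolution, and the name `notes_to_piano_roll` are assumptions chosen for the example.

```python
import math
from typing import List, Tuple

# Assumed note layout: (onset_seconds, offset_seconds, midi_pitch)
Note = Tuple[float, float, int]

LOWEST_PITCH = 21  # MIDI number of A0, the lowest key on an 88-key piano
NUM_KEYS = 88


def notes_to_piano_roll(notes: List[Note], frames_per_second: int = 10) -> List[List[int]]:
    """Return a NUM_KEYS x num_frames binary matrix (1 = key sounding)."""
    if not notes:
        return [[] for _ in range(NUM_KEYS)]

    end_time = max(offset for _, offset, _ in notes)
    num_frames = math.ceil(end_time * frames_per_second)
    roll = [[0] * num_frames for _ in range(NUM_KEYS)]

    for onset, offset, pitch in notes:
        key = pitch - LOWEST_PITCH
        if not 0 <= key < NUM_KEYS:
            continue  # skip pitches outside the piano range
        start = int(onset * frames_per_second)
        # Every note occupies at least one frame.
        stop = max(start + 1, math.ceil(offset * frames_per_second))
        for frame in range(start, min(stop, num_frames)):
            roll[key][frame] = 1
    return roll
```

Each row is one piano key and each column one time frame, so chords show up as several rows active in the same column.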

🛠 How it works

  1. You upload an audio or video file, or paste a YouTube link.
  2. The audio is transcribed to MIDI using a piano transcription model.
  3. Three difficulty models, one per representation listed above, each generate a prediction.
  4. You can listen to the extracted MP3 and the generated MIDI.
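
The steps above can be sketched as one function that transcribes first and then runs each difficulty model. This is only a rough outline under stated assumptions: `estimate_difficulty`, `DifficultyReport`, the callable signatures, and the use of a float as the prediction type are all hypothetical, and the real logic lives in app.py.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical signatures: a transcriber maps an audio path to a MIDI path,
# and a difficulty model maps (audio_path, midi_path) to a numeric score.
Transcriber = Callable[[str], str]
DifficultyModel = Callable[[str, str], float]


@dataclass
class DifficultyReport:
    midi_path: str
    predictions: Dict[str, float]


def estimate_difficulty(
    audio_path: str,
    transcribe: Transcriber,
    models: Dict[str, DifficultyModel],
) -> DifficultyReport:
    """Transcribe the audio once, then run every difficulty model on it."""
    midi_path = transcribe(audio_path)
    predictions = {
        name: model(audio_path, midi_path) for name, model in models.items()
    }
    return DifficultyReport(midi_path=midi_path, predictions=predictions)
```

Passing the transcriber and models in as arguments keeps the sketch independent of any particular checkpoint.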

📦 Model loading

All models are stored separately in the pramoneda/audio model repository and are downloaded dynamically via huggingface_hub.
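
A minimal sketch of that download step, using `hf_hub_download` from huggingface_hub. The checkpoint filenames below are placeholders, since this README does not list the actual files in the repository, and `fetch_checkpoint` is a hypothetical helper name.

```python
from pathlib import Path

MODEL_REPO = "pramoneda/audio"  # model repository named in this README

# Placeholder filenames: the real checkpoint names are not given in this README.
CHECKPOINTS = {
    "cqt": "cqt.ckpt",
    "piano_roll": "piano_roll.ckpt",
    "multimodal": "multimodal.ckpt",
}


def fetch_checkpoint(name: str, repo_id: str = MODEL_REPO) -> Path:
    """Download one checkpoint (or reuse the local cache) and return its path."""
    if name not in CHECKPOINTS:
        raise KeyError(f"unknown checkpoint {name!r}; expected one of {sorted(CHECKPOINTS)}")

    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=CHECKPOINTS[name])
    return Path(local_path)
```

`hf_hub_download` caches files locally, so repeated calls for the same checkpoint should not download it again.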

πŸ“ Input formats

  • MP3 audio
  • MP4 video (audio extracted automatically)
  • YouTube links
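
One way to tell these three input kinds apart is to inspect the URL host or the file extension, as in the sketch below. The function name `classify_input` and the list of YouTube hosts are assumptions for illustration; the app may validate input differently.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumed set of hostnames treated as YouTube links.
YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "m.youtube.com",
    "music.youtube.com",
    "youtu.be",
}


def classify_input(source: str) -> str:
    """Return "youtube", "mp3", or "mp4"; raise ValueError for anything else."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            return "youtube"
        raise ValueError(f"unsupported URL host: {host!r}")

    suffix = Path(source).suffix.lower()
    if suffix == ".mp3":
        return "mp3"
    if suffix == ".mp4":
        return "mp4"
    raise ValueError(f"unsupported file type: {suffix!r}")
```

For example, `classify_input("https://youtu.be/abc123")` returns `"youtube"` and `classify_input("clip.mp4")` returns `"mp4"`.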

✨ Built with

  • gradio for the interface
  • pydub and yt_dlp for audio processing
  • huggingface_hub to load model checkpoints
  • ffmpeg-python for format conversion
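
To show how two of these libraries might fit together, the sketch below fetches a YouTube audio track as MP3 with yt_dlp and converts an MP4 upload to MP3 with pydub. The helper names, the output template, and the 192 kbps quality setting are assumptions, not the app's configuration; both libraries need an ffmpeg binary on the system.

```python
from pathlib import Path


def build_ydl_options(output_dir: str) -> dict:
    """Build yt_dlp options that keep the best audio track and convert it to MP3."""
    return {
        "format": "bestaudio/best",
        "outtmpl": str(Path(output_dir) / "%(id)s.%(ext)s"),
        "postprocessors": [
            {
                "key": "FFmpegExtractAudio",
                "preferredcodec": "mp3",
                "preferredquality": "192",  # assumed bitrate in kbps
            }
        ],
        "quiet": True,
    }


def download_youtube_audio(url: str, output_dir: str) -> Path:
    """Download the audio of one video and return the path of the MP3 file."""
    import yt_dlp  # imported lazily; requires `pip install yt-dlp`

    with yt_dlp.YoutubeDL(build_ydl_options(output_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
    return Path(output_dir) / f"{info['id']}.mp3"


def mp4_to_mp3(video_path: str, mp3_path: str) -> str:
    """Extract the audio track of an MP4 file and save it as MP3."""
    from pydub import AudioSegment  # imported lazily; requires `pip install pydub`

    audio = AudioSegment.from_file(video_path, format="mp4")
    audio.export(mp3_path, format="mp3")
    return mp3_path
```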

🔗 Related