
Plans for audio tags

#11
by BlindTech - opened

Hi 👋

Are there any plans for adding audio / performance tags such as [laughs], [sighs], [gulps] and so forth? Also, when it comes to speech style, is it only possible to control it via the reference audio? For example, I want speech to be calm in general, but depending on context it might need to become more expressive / emotional. Are there any plans around that as well?

Thank you in advance!

Neuphonic org • edited 4 days ago

hey! no plans for audio tags at the moment, though they’re definitely possible within the framework - so for now it’s only possible to control style with the reference. might be worth checking out the fine-tuning discussion over here!
