Additional pre-trained voices
Nice Demo NihalGazi
I assume you're using the OpenAI Text-to-Speech (TTS) API?
- OpenAI's API does not seem to support an emotion parameter (emotion=ENCODED_EMOTION). Which TTS API endpoint are you using?
- Where are you getting the additional pre-trained voices,
like "coral", "verse", "ballad", "ash", "sage", "amuch", "dan"?
- Are you using any additional open-source TTS projects (like Bark, Coqui TTS, or a custom Hugging Face Space)?
- How does the OpenAI TTS API know how to handle these additionally selected voices?
Hey there! Thanks for the feedback.
Beyond OpenAI's default TTS model, there is another, smaller model called gpt-4o-mini-tts, which offers additional voices that are much more customisable than the default ones.
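For reference, a minimal sketch of how one of those extra voices might be selected, assuming the official OpenAI Python SDK (openai >= 1.x); the voice name, instructions text, and output filename are illustrative, and gpt-4o-mini-tts steers tone via a free-text `instructions` field rather than an emotion= parameter:

```python
import os

# Request parameters for the speech endpoint (assumed names per the
# OpenAI audio API; "coral" is one of the newer voices mentioned above).
params = {
    "model": "gpt-4o-mini-tts",
    "voice": "coral",
    "input": "Welcome back! Great to see you again.",
    # Emotion/tone is prompted in plain text, not a dedicated parameter:
    "instructions": "Speak in a warm, excited tone.",
}

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()
    # Stream the synthesized audio straight to an mp3 file.
    with client.audio.speech.with_streaming_response.create(**params) as resp:
        resp.stream_to_file("greeting.mp3")
```

The guard on OPENAI_API_KEY just keeps the sketch from firing a network call when no key is configured.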
wait, how am i suppose to make my OWN voice? 🤨
You can't. This isn't actually "voice cloning"; it's plain text-to-speech. For voice cloning, you can check out XTTS, F5-TTS, or Chatterbox TTS.
I'll try to implement voice cloning ASAP.