Great Release! And Good initiative on the tts-voices - small feedback

#1
by ExeVirus - opened

My only feedback is that I am unwilling to be the only donor to such a voice set. With a sample size of 20 voices, there's a high chance of my voice being overused by AI in general.

I would ask, if this initial request for voices falls flat (too few donations), that you attempt an "all or nothing" voice set that lets people donate their voices only if the released CC0 set has something like 5,000 voice samples or more. That would greatly improve the willingness of all participants to donate, as they would each be just one small piece of a big puzzle. The dataset could grow from there, and each new voice would reduce the likelihood of any one person's voice being overused.

Second, you may have to consider restricting voice donations from children (i.e. under 18 or 17, or whatever the applicable rules on the age of adulthood say).

ExeVirus changed discussion title from Great Release! And Good initiative on the tts-voices - small aside to Great Release! And Good initiative on the tts-voices - small feedback

Is there an LJSpeech voice here?

I would ask, if this initial request for voices falls flat (too few donations), that you attempt an "all or nothing" voice set, [...]

Or how about, instead of an "all or nothing" voice set, just offering an additional "conditional" opt-in? Meaning all voices that are donated unconditionally can be released no matter what, and in addition there are donated voices that only get added to the voice set once it reaches at least X unique voices.

That would fulfill your condition without putting other contributors' unconditional voice donations at risk, as an "all or nothing" approach would. It would also allow the voice set to be released incrementally, and once enough voices have accumulated, the conditionally donated voices can be released on top. It might also incentivize even more people to donate their voice if their own donation can unlock even more voices... They could see a message like "we currently need 2 more voice donations to release 100 conditionally donated voices".
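
To make the "conditional" opt-in idea concrete, here is a rough Python sketch of how the release rule could work. Everything in it is hypothetical: the `Donation` class, the function names, and the single shared threshold are my own assumptions, and I'm also assuming the threshold counts every unique donor (conditional and unconditional alike). It is not how the voice set is actually managed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Donation:
    donor_id: str      # hypothetical unique identifier for one donor
    conditional: bool  # True: release only once the threshold is reached


def release_status(donations, threshold):
    """Return (released_ids, held_back_count, still_needed).

    Assumption: the threshold counts all unique donors, including
    those whose donation is still being held back.
    """
    unique = {d.donor_id: d for d in donations}  # one voice per donor
    total = len(unique)
    threshold_met = total >= threshold
    released = sorted(
        donor_id
        for donor_id, d in unique.items()
        if threshold_met or not d.conditional
    )
    held_back = total - len(released)
    still_needed = max(0, threshold - total)
    return released, held_back, still_needed


def status_message(donations, threshold):
    """Build the kind of progress message suggested above."""
    _, held_back, still_needed = release_status(donations, threshold)
    if held_back == 0:
        return "All donated voices are released."
    return (
        f"We currently need {still_needed} more voice donations "
        f"to release {held_back} conditionally donated voices."
    )
```

With 3 unconditional donors, 100 conditional donors, and a threshold of 105, only the 3 unconditional voices would be released and the message would ask for 2 more donations. Once two more people donate, all 105 voices are released at once.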

What about voices generated through platforms like ElevenLabs? I see there are no American voices that are open to commercial use, and I would like a set of such voices to be available to work with. If you offered audio file uploads, could you detect their watermark? Or would that be too much of a licensing problem?

If anyone here needs to remove ElevenLabs' watermark, I can do that for a small fee.

Great work, and the example of your own voice for voice cloning is priceless. I would use Kyutai if it released a way for users to use their own embeddings. Currently it's like having a great voice model but without the creativity. I just want to be able to have personal assistants with different voices that I choose, not just the ones that were provided.

If you want a custom voice, VITS is your best friend!
