The important question

#4
by yukiarimo - opened

So, how about releasing the full dataset? Or have you just illegally ripped off stolen voices from the web?

surely this is the best way to ask for anything

most companies/developers wouldn't release a training dataset, even when the model is open source. this is not unusual.

  1. A bunch of projects like VITS, Tacotron, etc., have released theirs! (And they usually use LJSpeech.)
  2. If you don't even say where the data is coming from, it's definitely 100% stolen, and they MUST be banned from HF!

right, hf should ban 95% of models then, including gpt, llama, and gemma as well. none of them have released datasets lol

btw, maya actually notes training data in the metadata
