Next release suggestion
#2, opened by yukiarimo
Hello!
When you release the next model, could you please give it the following configuration:
- Input: visual (up to 4K; for small images, make the image tokenizer super efficient) + audio (48 kHz) + text
- Output: text + audio (48 kHz) only. No images!
- Architecture: no diffusion!
- Size: ~4B dense
Thanks!
Hello @yukiarimo,
Thank you for your detailed and thoughtful feature request for a future model. It's great to see such specific and well-considered ideas from the community.
We will be sure to share updates on our progress and any new model releases through our official channels. Thanks again for your passion and for sharing your vision with us.
Best regards,
Lychee Team