Next release suggestion

#2
by yukiarimo - opened

Hello!

When you release the next model, could you please build it with the following configuration:

  • Input: visual (up to 4K resolution; for small images, please make the image tokenizer very efficient) + audio (48 kHz) + text
  • Output: text + audio (48 kHz) ONLY. No images!
  • Architecture: no diffusion!
  • Size: ~4B dense

Thanks!

Lychee Team org

Hello @yukiarimo,

Thank you for your detailed and thoughtful feature request for a future model. It's great to see such specific and well-considered ideas from the community.

We will be sure to share updates on our progress and any new model releases through our official channels. Thanks again for your passion and for sharing your vision with us.

Best regards,

Lychee Team
