
Hard to believe, but 256M seems slower than internvl-1B?

#25
by josefph - opened

As the title says, it's hard to believe that smolvlm-256M-instruct is slower than internvl-1B. Even after inspecting the input embeddings and parameter counts, I still can't figure out why:

internvl-1B >
inp_embed : (1, 547, 896)
trainable params: 17,596,416 || all params: 647,260,288 || trainable%: 2.7186

smolvlm-256M >
inp_embed : (1, 171, 576)
trainable params: 9,768,960 || all params: 172,742,976 || trainable%: 5.6552
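For reference, the counts above can be reproduced with a small helper (a sketch; the real run would load the two VLM checkpoints from the Hub, while here a tiny stand-in module is used so the snippet runs on its own):

```python
import torch
from torch import nn

def count_params(model: nn.Module) -> tuple[int, int]:
    """Return (trainable, total) parameter counts, in the same style as
    PEFT's print_trainable_parameters() output quoted above."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total

# Tiny stand-in model; a real comparison would load each VLM checkpoint
# with AutoModelForVision2Seq.from_pretrained(...) instead.
m = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))
for p in m[0].parameters():
    p.requires_grad = False  # freeze the first layer, as LoRA-style tuning does

trainable, total = count_params(m)
print(f"trainable params: {trainable:,} || all params: {total:,} || "
      f"trainable%: {100 * trainable / total:.4f}")
```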

Has anyone seen a similar issue?
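One caveat worth checking before comparing: parameter count alone doesn't determine latency. The number of image tokens, the generated sequence length, the LM head's vocabulary size, and kernel efficiency on the given hardware all matter, so a fair comparison needs identical generation settings, warmup iterations, and (on GPU) explicit synchronization. A minimal timing helper along those lines (`benchmark` is a hypothetical name; the trivial workload stands in for a `model.generate(**inputs)` call):

```python
import time
import statistics

def benchmark(fn, warmup: int = 3, runs: int = 10) -> float:
    """Median wall-clock latency of fn() in ms, after warmup runs.
    For GPU models, call torch.cuda.synchronize() inside fn so the
    timing covers the actual kernel execution, not just the launch."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(times)

# Trivial workload as a placeholder; wrap each model's generate() call
# with the same prompt, image, and max_new_tokens for a fair comparison.
print(f"{benchmark(lambda: sum(range(10_000))):.3f} ms")
```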

[attached screenshot: AutoDriving latency comparison, in ms]
