CASLIE-M

This repo contains the models for "Captions Speak Louder than Images (CASLIE): Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data"

CASLIE Models

The CASLIE-M model is instruction-tuned from the medium-size base model Mistral-7B-Instruct-v0.3.

Citation

@article{ling2024captions,
    title={Captions Speak Louder than Images (CASLIE): Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data},
    author={Ling, Xinyi and Peng, Bo and Du, Hanwen and Zhu, Zhihui and Ning, Xia},
    journal={arXiv preprint arXiv:2410.17337},
    year={2024}
}