license: apache-2.0
language:
  - en
base_model:
  - openai/clip-vit-base-patch32
  - sentence-transformers/all-MiniLM-L6-v2
  - google-t5/t5-small
pipeline_tag: image-text-to-text
library_name: transformers

tags:
  - image-to-text
  - clip
  - t5
  - sentence-transformers
  - rag

# RAG Image Captioning Model

This is a RAG-based image captioning model using CLIP (`openai/clip-vit-base-patch32`), T5 (`t5-small`), and SentenceTransformer (`all-MiniLM-L6-v2`). It retrieves similar captions from a FAISS index and generates a caption using T5.

## Files

- `inference.py`: Custom inference script with a `predict` function.
- `requirements.txt`: Dependencies.
- `faiss_index.idx`: FAISS index for retrieval.
- `captions.json`: Caption corpus.
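
As a rough illustration of how `faiss_index.idx` and `captions.json` might work together, the sketch below does a brute-force nearest-neighbour lookup in plain Python and assembles a text prompt from the hits. It is not the code in `inference.py`: the toy vectors, the `retrieve` and `build_prompt` helpers, and the prompt wording are all assumptions made for illustration, and the brute-force search only stands in for a FAISS index query.

```python
import math

# Toy stand-ins (assumptions): in the real pipeline the captions would come
# from captions.json and the vectors would live in the FAISS index.
captions = [
    "a dog running on the beach",
    "a cat sleeping on a sofa",
    "a puppy playing in the sand",
]
vectors = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.8, 0.3, 0.1],
]


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, k=2):
    """Return the k captions whose vectors are closest to the query."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: l2_distance(query, vectors[i]),
    )
    return [captions[i] for i in ranked[:k]]


def build_prompt(retrieved):
    """Join retrieved captions into one input string for a text generator."""
    return "summarize: " + " ".join(c + "." for c in retrieved)


query = [0.9, 0.15, 0.0]  # hypothetical image embedding
hits = retrieve(query)
prompt = build_prompt(hits)
print(prompt)
```

In this toy setup the two beach-related captions are the nearest neighbours, so they end up in the prompt that a generator such as T5 would receive.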

## Usage

Upload an image to generate a caption. Designed for API integration via Hugging Face Spaces or custom deployment.

## Setup

Install dependencies from `requirements.txt` and ensure `en_core_web_sm` is installed for spaCy:

    pip install -r requirements.txt
    python -m spacy download en_core_web_sm
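
After installing, a quick check like the one below can confirm that the expected packages are importable. This is a minimal sketch: the module list is an assumption inferred from the components named in this README (the actual contents of `requirements.txt` may differ), and `missing_modules` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Assumed import names for the components mentioned in this README;
# adjust the list to match the real requirements.txt.
EXPECTED_MODULES = [
    "transformers",
    "sentence_transformers",
    "faiss",
    "spacy",
    "en_core_web_sm",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(EXPECTED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All expected modules are importable.")
```

`importlib.util.find_spec` only looks the module up without importing it, so the check stays fast even for large packages.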