bsoupy/RAGExplo · Hugging Face

library_name: transformers
tags:
- image-to-text
- clip
- t5
- sentence-transformers
- rag
pipeline_tag: image-to-text
license: apache-2.0

RAG Image Captioning Model

This is a RAG-based image captioning model using CLIP (openai/clip-vit-base-patch32), T5 (t5-small), and SentenceTransformer (all-MiniLM-L6-v2). It retrieves similar captions from a FAISS index and generates a caption using T5.

Files

- inference.py: Custom inference script with a predict function.
- requirements.txt: Dependencies.
- faiss_index.idx: FAISS index for retrieval.
- captions.json: Caption corpus.
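The retrieve-then-generate flow described above can be illustrated with a small, dependency-free sketch. It is not the code in inference.py: brute-force cosine similarity stands in for the FAISS index, hand-written toy vectors stand in for the CLIP and SentenceTransformer embeddings, and the helper names (`cosine`, `retrieve`, `build_prompt`) and the `summarize:` prompt format are assumptions made for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, caption_vecs, captions, k=2):
    """Return the k captions whose vectors are most similar to query_vec.

    Stand-in for a FAISS nearest-neighbour search (hypothetical helper).
    """
    scored = sorted(
        zip(captions, caption_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [caption for caption, _ in scored[:k]]

def build_prompt(retrieved):
    """Join retrieved captions into one text prompt for the generator.

    The prompt format is an assumption; the model card does not specify it.
    """
    return "summarize: " + " ".join(retrieved)

# Toy caption corpus and toy embeddings (real ones would come from the encoders).
captions = ["a dog runs on grass", "a cat sleeps on a sofa", "a puppy plays in a park"]
caption_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.8, 0.3, 0.1]]
query_vec = [1.0, 0.2, 0.0]  # toy embedding of the input image

top = retrieve(query_vec, caption_vecs, captions, k=2)
prompt = build_prompt(top)
print(prompt)
```

In a real pipeline, the resulting prompt would be passed to the T5 model to produce the final caption, and the sorted scan would be replaced by a search against faiss_index.idx with captions looked up in captions.json.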

Usage

Upload an image to generate a caption. Designed for API integration via Hugging Face Spaces or custom deployment.

Setup

Install dependencies from requirements.txt and ensure en_core_web_sm is installed for spaCy:

    pip install -r requirements.txt
    python -m spacy download en_core_web_sm
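After running the setup commands, a quick check like the following may help confirm that the environment is complete. This is a minimal sketch: the module names in `REQUIRED` are assumed from the libraries named in this card, since the actual contents of requirements.txt are not shown here, and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed top-level import names; adjust to match the real requirements.txt.
REQUIRED = ["transformers", "sentence_transformers", "faiss", "spacy", "en_core_web_sm"]

def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

spaCy models such as en_core_web_sm are installed as ordinary Python packages, which is why the model name can be checked the same way as the libraries.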
