---
library_name: transformers
tags:
- image-to-text
- clip
- t5
- sentence-transformers
- rag
pipeline_tag: image-to-text
license: apache-2.0
---

# RAG Image Captioning Model

This is a RAG-based image captioning model using CLIP (`openai/clip-vit-base-patch32`), T5 (`t5-small`), and SentenceTransformer (`all-MiniLM-L6-v2`). It retrieves similar captions from a FAISS index and generates a caption using T5.

## Files

- `inference.py`: Custom inference script with a `predict` function.
- `requirements.txt`: Dependencies.
- `faiss_index.idx`: FAISS index for retrieval.
- `captions.json`: Caption corpus.

## Usage

Upload an image to generate a caption. The model is designed for API integration via Hugging Face Spaces or a custom deployment.

## Setup

Install the dependencies from `requirements.txt`, then make sure the spaCy model `en_core_web_sm` is installed:

- `pip install -r requirements.txt`
- `python -m spacy download en_core_web_sm`
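
Once the dependencies are installed, the retrieve-then-generate flow described above can be illustrated with a minimal, standard-library-only sketch. It is not the code in `inference.py`: the toy vectors stand in for real embeddings, the brute-force cosine ranking stands in for the FAISS lookup, and the helper names (`retrieve`, `build_prompt`) and the `summarize:` prompt prefix are assumptions made for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, corpus_vecs, captions, k=2):
    """Return the k captions whose vectors are most similar to the query.

    Hypothetical stand-in for a FAISS nearest-neighbour search.
    """
    ranked = sorted(
        zip(captions, corpus_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [caption for caption, _ in ranked[:k]]

def build_prompt(retrieved):
    """Join retrieved captions into one text prompt for the generator.

    The "summarize:" prefix is an assumption, not taken from the model card.
    """
    return "summarize: " + " ".join(retrieved)

# Toy caption corpus and made-up 3-d embeddings (real embeddings are much larger).
captions = [
    "a dog running on grass",
    "a plate of pasta",
    "a puppy playing in a park",
]
corpus_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.8, 0.3, 0.1]]

# Made-up embedding of the uploaded image.
query_vec = [1.0, 0.2, 0.0]

retrieved = retrieve(query_vec, corpus_vecs, captions, k=2)
prompt = build_prompt(retrieved)
print(prompt)
```

In the actual pipeline, the prompt would presumably be tokenized and passed to T5 to produce the final caption; that step is omitted here because it requires downloading the model weights.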