title: Multimodal AI Search Engine
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: mit

🔍 Multimodal AI Search Engine

An image search engine that supports both text-to-image and image-to-image similarity search, built on CLIP embeddings and a FAISS index.

🌟 Features

  • 🔤 Text-to-Image Search: Find images using natural language descriptions
  • 🖼️ Image-to-Image Search: Upload an image to find visually similar ones
  • ⚡ Fast Search: Sub-second query response times using FAISS indexing
  • 🎯 High Accuracy: Powered by OpenAI's CLIP-ViT-B-32 model
  • 🎨 Modern UI: Clean, responsive Gradio interface

🚀 How It Works

  1. First Visit: The app automatically downloads 500 images from the Caltech101 dataset
  2. Embedding Generation: Creates CLIP embeddings for all images using the ViT-B-32 model
  3. Index Building: Builds a FAISS index for fast similarity search
  4. Ready to Search: Use text descriptions or upload images to find similar content
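
Steps 3 and 4 above amount to storing unit-length vectors and ranking them by similarity to a query vector. The following is a minimal sketch of that idea in plain Python, using toy 3-dimensional vectors in place of CLIP embeddings (real ViT-B-32 embeddings have 512 dimensions). The `normalize`, `build_index`, and `search` helpers are hypothetical names, and the brute-force scoring is only a stand-in for what an exact inner-product FAISS index would compute; it is not the app's actual code.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_index(embeddings):
    """Hypothetical stand-in for an index: a list of unit-length vectors."""
    return [normalize(v) for v in embeddings]


def search(index, query, k=5):
    """Return up to k (score, position) pairs, highest similarity first."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(vec, q)), pos)
        for pos, vec in enumerate(index)
    )
    return heapq.nlargest(k, scored)


# Toy embeddings; in the app these would come from the CLIP image encoder.
image_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
index = build_index(image_embeddings)

# A text query and an image query are both just vectors in the same space,
# so one search function can serve both modes.
query_embedding = [1.0, 0.05, 0.0]
results = search(index, query_embedding, k=2)
for score, pos in results:
    print(f"image {pos}: similarity {score:.3f}")
```

Normalizing both the stored vectors and the query means the inner product equals cosine similarity, which is why an inner-product index is a common choice for CLIP-style embeddings.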

🔧 Technology Stack

  • CLIP-ViT-B-32: OpenAI's vision-language model
  • FAISS: Facebook's similarity search library
  • Gradio: Interactive web interface
  • Caltech101: Image dataset; 500 images sampled from its 101 object categories

📊 Dataset

  • Source: Caltech101 via Hugging Face
  • Size: 500 randomly sampled images
  • Categories: 101 different object classes
  • Auto-Setup: Downloads and processes on first run
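
The list above says the 500 images are randomly sampled but does not say how. One common way to make such a sample reproducible is to draw indices from a seeded random generator, sketched below. The dataset size, the seed, and the `sample_indices` helper are illustrative assumptions, not values taken from the app.

```python
import random


def sample_indices(dataset_size, sample_size=500, seed=42):
    """Pick `sample_size` distinct indices; the same seed gives the same subset."""
    if sample_size > dataset_size:
        raise ValueError("sample_size cannot exceed dataset_size")
    rng = random.Random(seed)  # local generator, leaves global random state alone
    return sorted(rng.sample(range(dataset_size), sample_size))


# 9000 is a placeholder dataset size for illustration only.
indices = sample_indices(dataset_size=9000)
print(len(indices), indices[:5])
```

Fixing the seed keeps the sampled subset stable across restarts, so stored embeddings would continue to line up with the same images.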

💡 Usage Tips

  • Text Search: Use descriptive phrases like "red car on road" or "cat sitting"
  • Image Search: Upload any image to find visually similar ones
  • Results: Adjust the number of results using the slider (1-20)
  • First Load: Initial setup may take 5-10 minutes while the app downloads and processes the image dataset
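
This README does not say whether or how the app stores the result of its first-time setup. A typical "build once, reuse afterwards" pattern is sketched below under that caveat. The `load_or_build` helper, the `slow_setup` placeholder, and the JSON file name are all hypothetical; a real version would persist the embeddings and the search index rather than a small dictionary.

```python
import json
import tempfile
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Return (data, cache_hit): load cached data if present, else build and save."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding="utf-8")), True
    data = build_fn()  # the slow step: download, embed, index
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data), encoding="utf-8")
    return data, False


def slow_setup():
    """Placeholder for the real download-and-embed work."""
    return {"num_images": 500, "model": "CLIP-ViT-B-32"}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache" / "index_meta.json"
    first, first_hit = load_or_build(cache_file, slow_setup)
    second, second_hit = load_or_build(cache_file, slow_setup)
    print(first_hit, second_hit)
```

Only the first call runs the slow build; the second call reads the saved file instead.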