---
title: Multimodal AI Search Engine
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: mit
---

# 🔍 Multimodal AI Search Engine

An image search engine that supports both text-to-image and image-to-image similarity search, built on CLIP embeddings and a FAISS index.

## 🌟 Features

- **🔤 Text-to-Image Search**: Find images using natural language descriptions
- **🖼️ Image-to-Image Search**: Upload an image to find visually similar ones
- **⚡ Fast Search**: Sub-second query response times using FAISS indexing
- **🎯 High Accuracy**: Powered by OpenAI's CLIP-ViT-B-32 model
- **🎨 Modern UI**: Clean, responsive Gradio interface

## 🚀 How It Works

1. **First Visit**: The app automatically downloads 500 images from the Caltech101 dataset
2. **Embedding Generation**: Creates CLIP embeddings for all images using the ViT-B-32 model
3. **Index Building**: Builds a FAISS index for fast similarity search
4. **Ready to Search**: Use text descriptions or upload images to find similar content
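
The search in steps 2 to 4 can be illustrated with a minimal, dependency-free sketch. It is not the code in `app.py`: random 512-dimensional vectors stand in for real CLIP ViT-B-32 embeddings, a brute-force scan stands in for the FAISS index, and it assumes cosine similarity over unit-normalized vectors as the ranking metric. The helper names `normalize`, `build_index`, and `search` are hypothetical.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so the inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def build_index(embeddings):
    """Store unit-normalized embeddings (a stand-in for a flat inner-product index)."""
    return [normalize(e) for e in embeddings]


def search(index, query, k=5):
    """Return the k (score, position) pairs with the highest cosine similarity."""
    q = normalize(query)
    scores = [(sum(a * b for a, b in zip(q, vec)), i) for i, vec in enumerate(index)]
    scores.sort(reverse=True)
    return scores[:k]


# Assumption: 500 random vectors replace the real image embeddings.
rng = random.Random(0)
image_embeddings = [[rng.gauss(0, 1) for _ in range(512)] for _ in range(500)]
index = build_index(image_embeddings)

# In the app, the query vector would come from encoding a text prompt or an
# uploaded image. Here, image 42 plus a little noise mimics a near-duplicate query.
query = [x + 0.01 * rng.gauss(0, 1) for x in image_embeddings[42]]
results = search(index, query, k=5)

for score, position in results:
    print(f"image {position}: similarity {score:.3f}")
```

Because text and images are embedded into the same vector space, the same `search` call would serve both query types. An exact inner-product index such as FAISS's `IndexFlatIP` computes the same ranking as this loop, only much faster.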

## 🔧 Technology Stack

- **CLIP-ViT-B-32**: OpenAI's vision-language model
- **FAISS**: Facebook's similarity search library
- **Gradio**: Interactive web interface
- **Caltech101**: 500 diverse images across 101 categories

## 📊 Dataset

- **Source**: Caltech101 via HuggingFace
- **Size**: 500 randomly sampled images
- **Categories**: 101 different object classes
- **Auto-Setup**: Downloads and processes on first run

## 💡 Usage Tips

- **Text Search**: Use descriptive phrases like "red car on road" or "cat sitting"
- **Image Search**: Upload any image to find visually similar ones
- **Results**: Adjust the number of results using the slider (1-20)
- **First Load**: The initial dataset setup may take 5-10 minutes

*Note: First-time setup may take several minutes as the app downloads and processes the image dataset.*