# Visual ChatGPT
**Visual ChatGPT** connects ChatGPT and a series of Visual Foundation Models to enable **sending** and **receiving** images during chatting.
See our paper: [<font size=5>Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models</font>](https://arxiv.org/abs/2303.04671)
## Visual ChatGPT Colab Support
You can run the Colab notebook with the following models: text to image, ImageCaptioning, BLIP VQA, and image to canny.
[Open in Colab](https://colab.research.google.com/drive/1vhF4f3091h1cHZUh5QK7qByBHUDKbSWA?usp=sharing)
## Demo
<img src="./assets/demo_short.gif" width="750">
## System Architecture
<p align="center"><img src="./assets/figure.jpg" alt="Logo"></p>
## Quick Start
```
# create a new environment
conda create -n visgpt python=3.8
# activate the new environment
conda activate visgpt
# prepare the basic environments
pip install -r requirement.txt
# download the visual foundation models
bash download.sh
# prepare your private OpenAI API key
export OPENAI_API_KEY={Your_Private_Openai_Key}
# create a folder to save images
mkdir ./image
# Start Visual ChatGPT !
python visual_chatgpt.py
```
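Before launching, it can help to confirm that the two manual steps above (exporting the key and creating the image folder) were actually done. The following is a minimal sketch, not part of the repository: the `preflight` helper is hypothetical, and it only assumes that the script expects the `OPENAI_API_KEY` environment variable and an `./image` folder, as the Quick Start describes.

```python
import os
from pathlib import Path


def preflight(env=os.environ, image_dir="./image"):
    """Return a list of setup problems (empty if none). Hypothetical helper."""
    problems = []
    # The Quick Start exports OPENAI_API_KEY before starting the script.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # The Quick Start creates ./image as the folder for saved images.
    if not Path(image_dir).is_dir():
        problems.append(f"image folder {image_dir!r} does not exist")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("setup problem:", problem)
```

Running this from the repository root prints one line per missing prerequisite and nothing when both are in place.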
## GPU memory usage
The table below lists the GPU memory usage of each visual foundation model. To save GPU memory, modify ``self.tools`` so that it loads fewer visual foundation models:
| Foundation Model | Memory Usage (MB) |
|------------------------|-------------------|
| ImageEditing | 6667 |
| ImageCaption | 1755 |
| T2I | 6677 |
| canny2image | 5540 |
| line2image | 6679 |
| hed2image | 6679 |
| scribble2image | 6679 |
| pose2image | 6681 |
| BLIPVQA | 2709 |
| seg2image | 5540 |
| depth2image | 6677 |
| normal2image | 3974 |
| Pix2Pix | 2795 |
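To decide which models to keep, the table can be turned into a small budgeting helper. This is a rough sketch, not code from the repository: `MEMORY_MB`, `total_memory_mb`, and `fits` are hypothetical names, the figures are copied from the table, and the sum assumes all selected models are loaded on one GPU at the same time with purely additive usage (real usage may differ).

```python
# Memory figures in MB, copied from the table above.
MEMORY_MB = {
    "ImageEditing": 6667,
    "ImageCaption": 1755,
    "T2I": 6677,
    "canny2image": 5540,
    "line2image": 6679,
    "hed2image": 6679,
    "scribble2image": 6679,
    "pose2image": 6681,
    "BLIPVQA": 2709,
    "seg2image": 5540,
    "depth2image": 6677,
    "normal2image": 3974,
    "Pix2Pix": 2795,
}


def total_memory_mb(models):
    """Sum the listed memory usage for the chosen model names."""
    unknown = [name for name in models if name not in MEMORY_MB]
    if unknown:
        raise KeyError(f"not in the table: {unknown}")
    return sum(MEMORY_MB[name] for name in models)


def fits(models, budget_mb):
    """True if the chosen models fit within budget_mb (additive assumption)."""
    return total_memory_mb(models) <= budget_mb


if __name__ == "__main__":
    chosen = ["ImageCaption", "BLIPVQA", "T2I"]
    print(chosen, total_memory_mb(chosen), "MB")
    print("all models:", total_memory_mb(MEMORY_MB), "MB")
```

A selection that passes `fits` for your GPU's memory is a reasonable starting point for trimming ``self.tools``; leave some headroom, since the additive estimate ignores framework overhead.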
## Acknowledgement
We are grateful to the following open-source projects:
- HuggingFace [[Project]](https://github.com/huggingface/transformers)
- ControlNet [[Paper]](https://arxiv.org/abs/2302.05543) [[Project]](https://github.com/lllyasviel/ControlNet)
- Stable Diffusion [[Paper]](https://arxiv.org/abs/2112.10752) [[Project]](https://github.com/CompVis/stable-diffusion)