---
title: Florence-2 Vision Tasks Demo
emoji: π
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: true
short_description: This is a Gradio-based demo showcasing Florence-2
license: mit
---
# Florence-2 Demo: Advancing a Unified Representation for a Variety of Vision Tasks
This is a Gradio-based demo showcasing **Florence-2**, a unified vision foundation model that handles a variety of computer vision tasks with a single, versatile architecture.
## Demo Preview

## About Florence-2
Florence-2 provides a unified representation that handles a diverse range of vision tasks, including:
- Object detection
- Image captioning
- Visual question answering
- OCR (Optical Character Recognition)
- Region proposal
- Segmentation
- And many more vision tasks
The model demonstrates how a single architecture can be effectively applied across multiple vision domains, eliminating the need for task-specific models.
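As a rough illustration of what "a single architecture" means in practice, the sketch below selects a task purely by changing the text prompt passed to the same model. It assumes the publicly released `microsoft/Florence-2-large` checkpoint and its task tokens (`<OD>`, `<CAPTION>`, `<OCR>`, `<REGION_PROPOSAL>`) as loaded through `transformers` with `trust_remote_code=True`. The `TASK_PROMPTS` mapping and the `task_prompt` and `run_task` helpers are hypothetical names used for illustration and are not necessarily how `app.py` is organized.

```python
# Hypothetical mapping from the task names listed above to Florence-2 task tokens.
TASK_PROMPTS = {
    "Object detection": "<OD>",
    "Image captioning": "<CAPTION>",
    "OCR": "<OCR>",
    "Region proposal": "<REGION_PROPOSAL>",
}


def task_prompt(task_name):
    """Return the task token for a task name, or raise ValueError if unknown."""
    try:
        return TASK_PROMPTS[task_name]
    except KeyError:
        raise ValueError(f"Unsupported task: {task_name!r}") from None


def run_task(image, task_name, model_id="microsoft/Florence-2-large"):
    """Run one task on a PIL image (sketch; reloads the model on every call)."""
    # Imported lazily so the mapping above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoProcessor

    prompt = task_prompt(task_name)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # The same model serves every task; only the prompt token changes.
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

    # Convert the raw token string into task-specific output (boxes, labels, text).
    return processor.post_process_generation(
        text, task=prompt, image_size=(image.width, image.height)
    )
```

For example, `run_task(Image.open("photo.jpg"), "Object detection")` would be expected to return a dictionary keyed by `<OD>`. A real application would load the model once and reuse it rather than reloading it per call.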
## Paper & Resources
**CVPR 2024 Paper**: [Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks](https://openaccess.thecvf.com/content/CVPR2024/papers/Xiao_Florence-2_Advancing_a_Unified_Representation_for_a_Variety_of_Vision_CVPR_2024_paper.pdf)
**CVPR Virtual Presentation**: [https://cvpr.thecvf.com/virtual/2024/poster/30529](https://cvpr.thecvf.com/virtual/2024/poster/30529)
**Research Poster**: [Poster.png](./Poster.png)
## Demo Features
This Gradio demo allows you to:
- Upload images and interact with Florence-2's various capabilities
- Test different vision tasks on your own images
- Experience the unified model's performance across multiple domains
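The upload-and-select-a-task flow described above could be wired up in Gradio roughly as follows. This is a minimal sketch, not the contents of `app.py`: `TASKS`, `describe_request`, and `build_demo` are hypothetical names, and `describe_request` is a placeholder that only reports what would be run instead of calling the model.

```python
# Hypothetical subset of tasks offered in the dropdown.
TASKS = ["Object detection", "Image captioning", "OCR"]


def describe_request(image, task):
    """Placeholder handler: a real app would run Florence-2 on the image here."""
    if image is None:
        return "Please upload an image first."
    width, height = image.size  # PIL images expose (width, height) as .size
    return f"Would run '{task}' on a {width}x{height} image."


def build_demo():
    """Build the interface: an image upload, a task dropdown, and a text result."""
    # Imported lazily so the handler above can be exercised without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=describe_request,
        inputs=[
            gr.Image(type="pil", label="Input image"),
            gr.Dropdown(choices=TASKS, value=TASKS[0], label="Task"),
        ],
        outputs=gr.Textbox(label="Result"),
        title="Florence-2 sketch",
    )
```

Calling `build_demo().launch()` starts a local server and prints the URL to open in a browser.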
## Getting Started
1. Install the required dependencies:
```bash
pip install -r requirements.txt
```
2. Run the demo:
```bash
python app.py
```
3. Open your browser and navigate to the provided local URL to start using the demo.
## References
**Hugging Face Spaces**:
- [Florence-2 Demo by gokaygokay](https://huggingface.co/spaces/gokaygokay/Florence-2)
- [Florence-SAM Integration by SkalskiP](https://huggingface.co/spaces/SkalskiP/florence-sam)
## Citation
If you use this demo or find Florence-2 useful in your research, please cite:
```bibtex
@inproceedings{xiao2024florence,
title={Florence-2: Advancing a unified representation for a variety of vision tasks},
author={Xiao, Bin and Wu, Haiping and Xu, Weijian and Dai, Xiyang and Hu, Houdong and Lu, Yumao and Zeng, Michael and Liu, Ce and Yuan, Lu},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={4818--4829},
year={2024}
}
```