---
title: Rex-Thinker Demo
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.1
app_file: demo/app.py
pinned: false
license: apache-2.0
---
# Rex-Thinker Demo

This is a demo application for Rex-Thinker-GRPO, a visual reasoning model that pairs GroundingDINO object detection with referring expression comprehension.

## Features

- **Object Detection**: Uses GroundingDINO to detect objects based on category names
- **Referring Expression Comprehension**: Identifies specific objects based on detailed descriptions
- **Interactive Web Interface**: Easy-to-use Gradio interface with real-time streaming
- **Visual Reasoning**: Shows the model's thinking process with detailed explanations
## How to Use

1. **Upload an Image**: Click on "Input Image" to upload your image
2. **Set Object Category**: Enter the general category of objects you want to detect (e.g., "person", "car", "dog")
3. **Enter Referring Expression**: Provide a detailed description of the specific object you want to identify (e.g., "person wearing red shirt and black hat")
4. **Adjust Visualization Settings**: Modify draw width and font size for better visualization
5. **Run the Model**: Click "Run with Streaming" to see the results
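
The inputs from the steps above can be pictured as one small request object. The sketch below is illustrative only: `DemoRequest`, `to_detector_prompt`, and the default draw width and font size are hypothetical names and values, not taken from `demo/app.py`. The lowercase, period-terminated prompt follows a common convention for GroundingDINO text queries, and it is an assumption that this demo applies it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoRequest:
    """Hypothetical container for the four user inputs of the demo."""

    category: str                # general class, e.g. "person"
    referring_expression: str    # specific description of the target
    draw_width: int = 5          # assumed default, in pixels
    font_size: int = 20          # assumed default, in points

    def __post_init__(self) -> None:
        # Reject inputs that would make the detection or reasoning step meaningless.
        if not self.category.strip():
            raise ValueError("category must not be empty")
        if not self.referring_expression.strip():
            raise ValueError("referring expression must not be empty")
        if self.draw_width < 1 or self.font_size < 1:
            raise ValueError("draw width and font size must be positive")


def to_detector_prompt(category: str) -> str:
    """Lowercase the category and end it with ' .' (assumed detector convention)."""
    cleaned = category.strip().lower().rstrip(".").strip()
    return f"{cleaned} ."
```

In this sketch the category goes to the detector, while the referring expression is kept for the later reasoning step.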
## Examples

The demo includes several pre-loaded examples:

- Tomato detection
- Helmet identification
- Person in vehicle
- Text recognition on clothing
- Pet detection

## Technical Details

- **Base Model**: Rex-Thinker-GRPO-7B
- **Object Detection**: GroundingDINO with SwinT backbone
- **Framework**: Gradio for web interface
- **Inference**: Supports streaming text generation
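
Streaming in a Gradio app is typically done with a generator function: each yielded value replaces the current content of the output component, so the handler yields the text accumulated so far rather than only the newest chunk. The sketch below shows just that accumulation pattern. `stream_accumulated` is a hypothetical helper, and the chunk source stands in for whatever token streamer the demo actually uses.

```python
from typing import Iterable, Iterator


def stream_accumulated(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the growing text after each new chunk arrives."""
    text = ""
    for chunk in chunks:
        text += chunk
        # A Gradio handler written as a generator would yield this value
        # so the output box always shows the full text generated so far.
        yield text
```

For example, feeding the chunks `"<think>"`, `"red shirt"`, and `"</think>"` produces three updates, the last of which is the complete string.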
## Model Information

Rex-Thinker-GRPO is a multimodal reasoning model that:

1. Uses GroundingDINO to propose candidate object locations
2. Applies visual reasoning to identify specific objects based on referring expressions
3. Provides detailed explanations of its reasoning process
4. Outputs precise bounding box coordinates for detected objects
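
The four steps above can be sketched as a propose-then-select pipeline. This is a minimal illustration under stated assumptions, not the model's actual interface: the `<answer>` tag, the `bbox_2d` key, the prompt wording, and the function names `parse_answer_boxes` and `refer` are all hypothetical, and the `reason` callable stands in for the real model call. Calling `refer` with two candidate boxes and a stub `reason` that returns an `<answer>` payload naming one of them yields a list containing only that box.

```python
import json
import re
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def parse_answer_boxes(model_text: str) -> List[Box]:
    """Extract boxes from an assumed '<answer>[{"bbox_2d": [...]}]</answer>' payload."""
    match = re.search(r"<answer>(.*?)</answer>", model_text, flags=re.DOTALL)
    if match is None:
        return []
    try:
        items = json.loads(match.group(1).strip())
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    boxes: List[Box] = []
    for item in items:
        coords = item.get("bbox_2d") if isinstance(item, dict) else None
        if (
            isinstance(coords, list)
            and len(coords) == 4
            and all(isinstance(v, (int, float)) for v in coords)
        ):
            x1, y1, x2, y2 = (int(round(v)) for v in coords)
            boxes.append((x1, y1, x2, y2))
    return boxes


def refer(
    candidates: Sequence[Box],
    expression: str,
    reason: Callable[[str], str],
) -> List[Box]:
    """Ask the reasoning step which detector candidates match the expression."""
    hints = ", ".join(f"[{x1}, {y1}, {x2}, {y2}]" for x1, y1, x2, y2 in candidates)
    prompt = f"Candidate boxes: {hints}\nWhich candidates match: {expression}?"
    selected = parse_answer_boxes(reason(prompt))
    # Keep only boxes the detector actually proposed.
    allowed = set(candidates)
    return [box for box in selected if box in allowed]
```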
For more information, visit the [original repository](https://github.com/IDEA-Research/Rex-Thinker-GRPO).