---
title: L Operator Demo
emoji: π
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.44.0
app_file: app.py
pinned: true
license: gpl
short_description: demo of l-operator with no commands
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🤖 L-Operator: Android Device Control Demo
A complete multimodal Gradio demo for the [L-Operator model](https://huggingface.co/Tonic/l-android-control), a fine-tuned multimodal AI agent based on LiquidAI's LFM2-VL-1.6B model, optimized for Android device control through visual understanding and action generation.
## Features
- **Multimodal Interface**: Upload Android screenshots and provide text instructions
- **Chat Interface**: Interactive chat with the model using Gradio's ChatInterface component
- **Action Generation**: Generate JSON actions for Android device control
- **Example Episodes**: Pre-loaded examples from extracted training episodes
- **Real-time Processing**: Optimized for real-time inference
- **Beautiful UI**: Modern, responsive interface with comprehensive documentation
- **⚡ ZeroGPU Compatible**: Dynamic GPU allocation for cost-effective deployment
## Model Details
| Property | Value |
|----------|-------|
| **Base Model** | [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) |
| **Architecture** | LFM2-VL (1.6B parameters) |
| **Fine-tuning** | LoRA (Low-Rank Adaptation) |
| **Training Data** | Android control episodes with screenshots and actions |
| **License** | Proprietary (Investment Access Required) |
## Quick Start
### Prerequisites
1. **Python 3.8+**: Ensure you have Python 3.8 or higher installed
2. **Hugging Face Access**: Request access to the [L-Operator model](https://huggingface.co/Tonic/l-android-control)
3. **Authentication**: Log in to Hugging Face using `huggingface-cli login`
### Installation
1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd l-operator-demo
   ```
2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```
3. **Authenticate with Hugging Face**:
   ```bash
   huggingface-cli login
   ```
### Running the Demo
1. **Start the demo**:
   ```bash
   python app.py
   ```
2. **Open your browser** and navigate to `http://localhost:7860`
3. **Load the model** by clicking the "Load L-Operator Model" button
4. **Upload an Android screenshot** and provide instructions
5. **Generate actions** or use the chat interface
## ⚡ ZeroGPU Deployment
This demo is optimized for [Hugging Face Spaces ZeroGPU](https://huggingface.co/docs/hub/spaces-zerogpu), providing dynamic GPU allocation for cost-effective deployment.
### ZeroGPU Features
- **Free GPU Access**: Dynamic NVIDIA H200 GPU allocation
- **⚡ On-Demand Resources**: GPUs allocated only when needed
- **💰 Cost Efficient**: Optimized resource utilization
- **Multi-GPU Support**: Leverage multiple GPUs concurrently
- **🛡️ Automatic Management**: Resources released after function completion
### ZeroGPU Specifications
| Specification | Value |
|---------------|-------|
| **GPU Type** | NVIDIA H200 slice |
| **Available VRAM** | 70GB per workload |
| **Supported Gradio** | 4+ |
| **Supported PyTorch** | 2.1.2, 2.2.2, 2.4.0, 2.5.1 |
| **Supported Python** | 3.10.13 |
| **Function Duration** | Up to 120 seconds per request |
### Deploying to Hugging Face Spaces
1. **Create a new Space** on Hugging Face:
   - Choose **Gradio SDK**
   - Select **ZeroGPU** in hardware options
   - Upload your code
2. **Space Configuration**:
   ```yaml
   # app.py is automatically detected
   # requirements.txt is automatically installed
   # ZeroGPU is automatically configured
   ```
3. **Access Requirements**:
   - **Personal accounts**: PRO subscription required
   - **Organizations**: Enterprise Hub subscription required
   - **Usage limits**: 10 Spaces (personal) / 50 Spaces (organization)
### ZeroGPU Integration Details
The demo automatically detects ZeroGPU availability and optimizes accordingly:
```python
# Automatic ZeroGPU detection
try:
    import spaces
    ZEROGPU_AVAILABLE = True
except ImportError:
    ZEROGPU_AVAILABLE = False

    class spaces:
        """No-op stand-in so the decorators below still work without ZeroGPU."""

        @staticmethod
        def GPU(duration=60):
            return lambda fn: fn


# GPU-optimized functions
@spaces.GPU(duration=120)  # 2 minutes for action generation
def generate_action(self, image, goal, instruction):
    # GPU-accelerated inference
    pass


@spaces.GPU(duration=90)  # 1.5 minutes for chat responses
def chat_with_model(self, message, history, image):
    # Interactive chat with GPU acceleration
    pass
```
## 🎯 How to Use
### Basic Usage
1. **Load Model**: Click "Load L-Operator Model" to initialize the model
2. **Upload Screenshot**: Upload an Android device screenshot
3. **Provide Instructions**:
   - **Goal**: Describe what you want to achieve
   - **Step**: Provide specific step instructions
4. **Generate Action**: Click "🎯 Generate Action" to get JSON output
### Chat Interface
1. **Upload Screenshot**: Upload an Android screenshot
2. **Send Message**: Use the structured format:
   ```
   Goal: Open the Settings app and navigate to Display settings
   Step: Tap on the Settings app icon on the home screen
   ```
3. **Get Response**: The model will generate JSON actions
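
The structured `Goal:` / `Step:` message above can be built and parsed with a few lines of Python. This is a minimal sketch: the helper names `format_message` and `parse_message` and the `Instruction` type are hypothetical and are not taken from `app.py`.

```python
from typing import NamedTuple, Optional


class Instruction(NamedTuple):
    goal: str
    step: str


def format_message(goal: str, step: str) -> str:
    """Build a chat message in the two-line Goal/Step format."""
    return f"Goal: {goal.strip()}\nStep: {step.strip()}"


def parse_message(message: str) -> Optional[Instruction]:
    """Extract goal and step; return None if either label is missing."""
    fields = {}
    for line in message.splitlines():
        key, sep, value = line.partition(":")
        label = key.strip().lower()
        if sep and label in ("goal", "step"):
            fields[label] = value.strip()
    if "goal" in fields and "step" in fields:
        return Instruction(fields["goal"], fields["step"])
    return None
```

Returning `None` for unstructured input lets a caller fall back to treating the whole message as free text.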
### Example Episodes
The demo includes pre-loaded examples from the training episodes:
- **Episode 13**: Cruise deals app navigation
- **Episode 53**: Pinterest search for sustainability art
- **Episode 73**: Moon phases app usage
## Expected Output Format
The model generates JSON actions in the following format:
```json
{
  "action_type": "tap",
  "x": 540,
  "y": 1200,
  "text": "Settings",
  "app_name": "com.android.settings",
  "confidence": 0.92
}
```
### Action Types
- `tap`: Tap at specific coordinates
- `click`: Click at specific coordinates
- `scroll`: Scroll in a direction (up/down/left/right)
- `input_text`: Input text
- `open_app`: Open a specific app
- `wait`: Wait for a moment
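
Before acting on the model's output, it may help to check it against the action types listed above. The sketch below uses only the standard library. The function name `parse_action` is hypothetical, and the rule that `tap` and `click` must carry numeric `x` and `y` fields is an assumption drawn from the example JSON, not a documented schema.

```python
import json

# Action types listed in this README.
ACTION_TYPES = {"tap", "click", "scroll", "input_text", "open_app", "wait"}


def parse_action(raw: str) -> dict:
    """Parse model output and reject anything outside the documented actions."""
    action = json.loads(raw)
    if not isinstance(action, dict):
        raise ValueError("expected a JSON object")
    action_type = action.get("action_type")
    if action_type not in ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action_type!r}")
    # Assumption: coordinate-based actions need numeric x and y.
    if action_type in {"tap", "click"}:
        for axis in ("x", "y"):
            if not isinstance(action.get(axis), (int, float)):
                raise ValueError(f"{action_type} requires numeric {axis!r}")
    return action
```

Both malformed JSON (`json.JSONDecodeError` is a subclass of `ValueError`) and unknown actions surface as `ValueError`, so a caller can handle them in one place.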
## 🛠️ Technical Details
### Model Configuration
- **Device**: Automatically detects CUDA/CPU
- **Precision**: bfloat16 for CUDA, float32 for CPU
- **Generation**: Temperature 0.7, Top-p 0.9
- **Max Tokens**: 128 for action generation
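
The settings above could be collected in one place as follows. This is a sketch, not the demo's code: `runtime_settings` is a hypothetical helper, the keyword names follow the Hugging Face `transformers` `generate()` arguments, and mapping "Max Tokens" to `max_new_tokens` is an assumption.

```python
def runtime_settings(cuda_available: bool) -> dict:
    """Return device, precision, and sampling settings from the list above."""
    return {
        "device": "cuda" if cuda_available else "cpu",
        "dtype": "bfloat16" if cuda_available else "float32",
        "generate_kwargs": {
            "do_sample": True,      # sampling must be enabled for temperature/top_p to apply
            "temperature": 0.7,
            "top_p": 0.9,
            "max_new_tokens": 128,  # assumption: "Max Tokens" means newly generated tokens
        },
    }
```

In practice the flag would typically come from `torch.cuda.is_available()`, and the `dtype` string would be resolved to the matching `torch` dtype when loading the model.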
### Architecture
- **Base Model**: LFM2-VL-1.6B from LiquidAI
- **Fine-tuning**: LoRA with rank 16, alpha 32
- **Target Modules**: q_proj, v_proj, fc1, fc2, linear, gate_proj, up_proj, down_proj
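
Written as data, the LoRA hyperparameters above look like this. The key names mirror the `r`, `lora_alpha`, and `target_modules` fields of `peft.LoraConfig`. Whether the model was actually trained with PEFT is an assumption, so the sketch stays a plain dictionary.

```python
# LoRA hyperparameters as listed above.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "v_proj", "fc1", "fc2",
        "linear", "gate_proj", "up_proj", "down_proj",
    ],
}

# Standard LoRA scales the low-rank update by alpha / r.
scaling = LORA_SETTINGS["lora_alpha"] / LORA_SETTINGS["r"]
print(f"LoRA scaling factor: {scaling}")
```

With rank 16 and alpha 32, the low-rank update is scaled by a factor of 2 under the standard LoRA formulation.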
### Performance
- **Model Size**: ~1.6B parameters
- **Memory Usage**: ~4GB VRAM (CUDA) / ~8GB RAM (CPU)
- **Inference Speed**: Optimized for real-time use
- **Accuracy**: 98% action accuracy on test episodes
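
The memory figures above are roughly what the parameter count and precision predict. A back-of-the-envelope check, counting weights only: bfloat16 stores 2 bytes per parameter and float32 stores 4. Attributing the rest of the quoted ~4GB and ~8GB to activations and runtime overhead is an assumption.

```python
PARAMS = 1.6e9  # ~1.6B parameters
BYTES_PER_PARAM = {"bfloat16": 2, "float32": 4}


def weight_memory_gb(dtype: str, params: float = PARAMS) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(dtype):.1f} GB")
```

This gives about 3.2 GB for bfloat16 on CUDA and 6.4 GB for float32 on CPU, both below the quoted totals.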
## 🎯 Use Cases
### 1. Mobile App Testing
- Automated UI testing for Android applications
- Cross-device compatibility validation
- Regression testing with visual verification
### 2. Accessibility Applications
- Voice-controlled device navigation
- Assistive technology integration
- Screen reader enhancement tools
### 3. Remote Support
- Remote device troubleshooting
- Automated device configuration
- Support ticket automation
### 4. Development Workflows
- UI/UX testing automation
- User flow validation
- Performance testing integration
## ⚠️ Important Notes
### Access Requirements
- **Investment Access**: This model is proprietary technology available exclusively to qualified investors under NDA
- **Authentication Required**: Must be authenticated with Hugging Face
- **Evaluation Only**: Access granted solely for investment evaluation purposes
- **Confidentiality**: All technical details are confidential
### ZeroGPU Limitations
- **Compatibility**: Currently exclusive to Gradio SDK
- **PyTorch Versions**: Limited to supported versions (2.1.2, 2.2.2, 2.4.0, 2.5.1)
- **Function Duration**: 60 seconds by default, customizable up to 120 seconds
- **Queue Priority**: PRO users get 5x more daily usage and highest priority
### General Limitations
- **Market Hours**: Some features may be limited during market hours
- **Device Requirements**: Requires sufficient RAM/VRAM for model loading
- **Network**: Requires internet connection for model download
- **Authentication**: Must have approved access to the model
## 🔧 Troubleshooting
### Common Issues
1. **Model Loading Error**:
   - Ensure you're authenticated: `huggingface-cli login`
   - Check internet connection
   - Verify model access approval
2. **Memory Issues**:
   - Use CPU if GPU memory is insufficient
   - Close other applications
   - Consider using smaller batch sizes
3. **Authentication Errors**:
   - Log in to Hugging Face again
   - Check access approval status
   - Contact support if issues persist
4. **ZeroGPU Issues**:
   - Verify ZeroGPU is selected in Space settings
   - Check PyTorch version compatibility
   - Ensure function duration is within limits
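
The PyTorch compatibility check can be scripted against the versions listed in this README. A minimal sketch: `is_supported_torch` is a hypothetical helper, and the version set is copied from the ZeroGPU table above and may be out of date relative to the ZeroGPU documentation.

```python
# Versions listed under "ZeroGPU Specifications" in this README.
SUPPORTED_TORCH = {"2.1.2", "2.2.2", "2.4.0", "2.5.1"}


def is_supported_torch(version: str) -> bool:
    """Check a torch version string, ignoring a local build suffix such as '+cu121'."""
    return version.split("+", 1)[0] in SUPPORTED_TORCH
```

A typical call would pass `torch.__version__`, which often carries a build suffix such as `+cu121`.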
### Performance Optimization
- **GPU Usage**: Use CUDA for faster inference
- **Memory Management**: Monitor VRAM usage
- **Batch Processing**: Process multiple images efficiently
- **ZeroGPU Optimization**: Specify appropriate function durations
## Support
- **Investment Inquiries**: For investment-related questions and due diligence
- **Technical Support**: For technical issues with the demo
- **Model Access**: For access requests to the L-Operator model
- **ZeroGPU Support**: [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
## License
This demo is provided under the same terms as the L-Operator model:
- **Proprietary Technology**: Owned by Tonic
- **Investment Evaluation**: Access granted solely for investment evaluation
- **NDA Required**: All access is subject to Non-Disclosure Agreement
- **No Commercial Use**: Without written consent
## Acknowledgments
- **LiquidAI**: For the base LFM2-VL model
- **Hugging Face**: For the transformers library, hosting, and ZeroGPU infrastructure
- **Gradio**: For the excellent UI framework
## Links
- [L-Operator Model](https://huggingface.co/Tonic/l-android-control)
- [Base Model (LFM2-VL-1.6B)](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
- [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
- [LiquidAI](https://liquid.ai/)
- [Tonic](https://tonic.ai/)
---
**Made with ❤️ by Tonic**