---
title: L Operator Demo
emoji: 📊
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.44.0
app_file: app.py
pinned: true
license: gpl
short_description: demo of l-operator with no commands
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# 🤖 L-Operator: Android Device Control Demo

A multimodal Gradio demo for the [L-Operator model](https://huggingface.co/Tonic/l-android-control), an AI agent fine-tuned from LiquidAI's LFM2-VL-1.6B model. It controls Android devices by reading screenshots and generating actions.

## 🌟 Features

- **Multimodal Interface**: Upload Android screenshots and provide text instructions
- **Chat Interface**: Interactive chat with the model using Gradio's ChatInterface component
- **Action Generation**: Generate JSON actions for Android device control
- **Example Episodes**: Pre-loaded examples from extracted training episodes
- **Real-time Processing**: Optimized for real-time inference
- **Beautiful UI**: Modern, responsive interface with comprehensive documentation
- **⚡ ZeroGPU Compatible**: Dynamic GPU allocation for cost-effective deployment

## 📋 Model Details

| Property | Value |
|----------|-------|
| **Base Model** | [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) |
| **Architecture** | LFM2-VL (1.6B parameters) |
| **Fine-tuning** | LoRA (Low-Rank Adaptation) |
| **Training Data** | Android control episodes with screenshots and actions |
| **License** | Proprietary (Investment Access Required) |

## 🚀 Quick Start

### Prerequisites

1. **Python 3.8+**: Ensure you have Python 3.8 or higher installed
2. **Hugging Face Access**: Request access to the [L-Operator model](https://huggingface.co/Tonic/l-android-control)
3. **Authentication**: Log in to Hugging Face using `huggingface-cli login`

### Installation

1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd l-operator-demo
   ```

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

3. **Authenticate with Hugging Face**:
   ```bash
   huggingface-cli login
   ```

### Running the Demo

1. **Start the demo**:
   ```bash
   python app.py
   ```

2. **Open your browser** and navigate to `http://localhost:7860`

3. **Load the model** by clicking the "🚀 Load L-Operator Model" button

4. **Upload an Android screenshot** and provide instructions

5. **Generate actions** or use the chat interface

## ⚡ ZeroGPU Deployment

This demo is optimized for [Hugging Face Spaces ZeroGPU](https://huggingface.co/docs/hub/spaces-zerogpu), providing dynamic GPU allocation for cost-effective deployment.

### ZeroGPU Features

- **🆓 Free GPU Access**: Dynamic NVIDIA H200 GPU allocation
- **⚡ On-Demand Resources**: GPUs allocated only when needed
- **💰 Cost Efficient**: Optimized resource utilization
- **🔄 Multi-GPU Support**: Leverage multiple GPUs concurrently
- **🛡️ Automatic Management**: Resources released after function completion

### ZeroGPU Specifications

| Specification | Value |
|---------------|-------|
| **GPU Type** | NVIDIA H200 slice |
| **Available VRAM** | 70GB per workload |
| **Supported Gradio** | 4+ |
| **Supported PyTorch** | 2.1.2, 2.2.2, 2.4.0, 2.5.1 |
| **Supported Python** | 3.10.13 |
| **Function Duration** | Up to 120 seconds per request |

### Deploying to Hugging Face Spaces

1. **Create a new Space** on Hugging Face:
   - Choose **Gradio SDK**
   - Select **ZeroGPU** in hardware options
   - Upload your code

2. **Space Configuration**:
   ```yaml
   # app.py is automatically detected
   # requirements.txt is automatically installed
   # ZeroGPU is automatically configured
   ```

3. **Access Requirements**:
   - **Personal accounts**: PRO subscription required
   - **Organizations**: Enterprise Hub subscription required
   - **Usage limits**: 10 Spaces (personal) / 50 Spaces (organization)

### ZeroGPU Integration Details

The demo automatically detects ZeroGPU availability and optimizes accordingly:

```python
# Automatic ZeroGPU detection
try:
    import spaces
    ZEROGPU_AVAILABLE = True
except ImportError:
    ZEROGPU_AVAILABLE = False

def gpu(duration):
    """Use spaces.GPU on ZeroGPU; otherwise leave the function unchanged."""
    if ZEROGPU_AVAILABLE:
        return spaces.GPU(duration=duration)
    return lambda func: func  # no-op decorator when `spaces` is not installed

# GPU-optimized functions
@gpu(duration=120)  # 2 minutes for action generation
def generate_action(self, image, goal, instruction):
    # GPU-accelerated inference
    pass

@gpu(duration=90)   # 1.5 minutes for chat responses
def chat_with_model(self, message, history, image):
    # Interactive chat with GPU acceleration
    pass
```

## 🎯 How to Use

### Basic Usage

1. **Load Model**: Click "🚀 Load L-Operator Model" to initialize the model
2. **Upload Screenshot**: Upload an Android device screenshot
3. **Provide Instructions**: 
   - **Goal**: Describe what you want to achieve
   - **Step**: Provide specific step instructions
4. **Generate Action**: Click "🎯 Generate Action" to get JSON output

### Chat Interface

1. **Upload Screenshot**: Upload an Android screenshot
2. **Send Message**: Use structured format:
   ```
   Goal: Open the Settings app and navigate to Display settings
   Step: Tap on the Settings app icon on the home screen
   ```
3. **Get Response**: The model will generate JSON actions
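
The structured `Goal:` / `Step:` message above can be built and parsed with a few lines of Python. This is a minimal sketch: the helper names `build_message` and `parse_message` are hypothetical and are not taken from `app.py`, which may handle the message differently.

```python
import re
from typing import Dict


def build_message(goal: str, step: str) -> str:
    """Format a goal and a step in the two-line structured chat format."""
    return f"Goal: {goal.strip()}\nStep: {step.strip()}"


def parse_message(message: str) -> Dict[str, str]:
    """Extract the Goal and Step fields; a missing field is returned as ''."""
    fields = {"goal": "", "step": ""}
    for line in message.splitlines():
        match = re.match(r"\s*(Goal|Step)\s*:\s*(.*)", line, flags=re.IGNORECASE)
        if match:
            fields[match.group(1).lower()] = match.group(2).strip()
    return fields
```

A message without either prefix parses to two empty strings, which a caller could use to prompt the user for the structured format.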

### Example Episodes

The demo includes pre-loaded examples from the training episodes:

- **Episode 13**: Cruise deals app navigation
- **Episode 53**: Pinterest search for sustainability art
- **Episode 73**: Moon phases app usage

## 📊 Expected Output Format

The model generates JSON actions in the following format:

```json
{
  "action_type": "tap",
  "x": 540,
  "y": 1200,
  "text": "Settings",
  "app_name": "com.android.settings",
  "confidence": 0.92
}
```

### Action Types

- `tap`: Tap at specific coordinates
- `click`: Click at specific coordinates
- `scroll`: Scroll in a direction (up/down/left/right)
- `input_text`: Input text
- `open_app`: Open a specific app
- `wait`: Wait for a moment
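
Before acting on a generated action, it can help to validate it against the action types listed above. The sketch below is illustrative only: `parse_action` is a hypothetical helper, and two behaviors are assumptions rather than documented model behavior: that the model may wrap the JSON object in extra text, and that `tap` and `click` always carry numeric `x` and `y` fields.

```python
import json
from typing import Any, Dict

# Action types taken from the list above
ACTION_TYPES = {"tap", "click", "scroll", "input_text", "open_app", "wait"}


def parse_action(raw_output: str) -> Dict[str, Any]:
    """Parse model output into an action dict; raise ValueError if invalid."""
    # Assumption: the JSON object may be surrounded by other text,
    # so take the span from the first "{" to the last "}".
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    try:
        action = json.loads(raw_output[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc

    action_type = action.get("action_type")
    if action_type not in ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action_type!r}")

    # Assumption: coordinate-based actions need numeric x and y.
    if action_type in {"tap", "click"}:
        for key in ("x", "y"):
            if not isinstance(action.get(key), (int, float)):
                raise ValueError(f"{action_type} action is missing numeric {key!r}")
    return action
```

Raising `ValueError` for every failure mode keeps the caller's error handling to a single `except` clause.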

## 🛠️ Technical Details

### Model Configuration

- **Device**: Automatically detects CUDA/CPU
- **Precision**: bfloat16 for CUDA, float32 for CPU
- **Generation**: Temperature 0.7, Top-p 0.9
- **Max Tokens**: 128 for action generation
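
The settings above can be collected into a single configuration, sketched below. The function name `select_runtime_config` and the dictionary layout are hypothetical. Mapping "Max Tokens" to `max_new_tokens` and enabling `do_sample` are assumptions; in `transformers`, temperature and top-p only take effect when sampling is enabled.

```python
from typing import Any, Dict


def select_runtime_config(cuda_available: bool) -> Dict[str, Any]:
    """Return device, precision, and generation settings for the demo."""
    return {
        "device": "cuda" if cuda_available else "cpu",
        # bfloat16 on CUDA, float32 on CPU, as listed above
        "dtype": "bfloat16" if cuda_available else "float32",
        "generation": {
            "do_sample": True,      # assumption: sampling is enabled
            "temperature": 0.7,
            "top_p": 0.9,
            "max_new_tokens": 128,  # assumption: "Max Tokens" maps to this
        },
    }
```

In an application that uses PyTorch, the argument would typically be `torch.cuda.is_available()`.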

### Architecture

- **Base Model**: LFM2-VL-1.6B from LiquidAI
- **Fine-tuning**: LoRA with rank 16, alpha 32
- **Target Modules**: q_proj, v_proj, fc1, fc2, linear, gate_proj, up_proj, down_proj
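
To give a feel for what rank 16 and alpha 32 mean, the sketch below computes the standard LoRA scaling factor (alpha / rank) and the number of trainable parameters an adapter adds to one linear layer. The 2048 x 2048 layer shape is a hypothetical example, not a dimension taken from LFM2-VL-1.6B.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank matrices A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical 2048 x 2048 projection layer, for illustration only
d_in, d_out = 2048, 2048
adapter_params = lora_trainable_params(d_in, d_out, rank=16)
full_params = d_in * d_out
print(f"scaling = {lora_scaling(16, 32)}")
print(f"adapter params = {adapter_params} ({adapter_params / full_params:.2%} of the full layer)")
```

With these illustrative dimensions, the adapter trains well under 2% of the layer's weights, which is the main reason LoRA fine-tuning is cheap.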

### Performance

- **Model Size**: ~1.6B parameters
- **Memory Usage**: ~4GB VRAM (CUDA) / ~8GB RAM (CPU)
- **Inference Speed**: Optimized for real-time use
- **Accuracy**: 98% action accuracy on test episodes
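
The memory figures above are consistent with a back-of-the-envelope estimate of weight storage alone: 1.6B parameters at 2 bytes each (bfloat16) or 4 bytes each (float32). The sketch below does that arithmetic; the helper name is hypothetical, and the estimate ignores activations, the vision inputs, and framework overhead, which account for the remaining headroom.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Memory needed to store the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 1_600_000_000  # ~1.6B parameters

bf16_gb = weight_memory_gb(NUM_PARAMS, 2)  # bfloat16 on CUDA
fp32_gb = weight_memory_gb(NUM_PARAMS, 4)  # float32 on CPU
print(f"bfloat16 weights: {bf16_gb:.1f} GB")
print(f"float32 weights:  {fp32_gb:.1f} GB")
```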

## 🎯 Use Cases

### 1. Mobile App Testing
- Automated UI testing for Android applications
- Cross-device compatibility validation
- Regression testing with visual verification

### 2. Accessibility Applications
- Voice-controlled device navigation
- Assistive technology integration
- Screen reader enhancement tools

### 3. Remote Support
- Remote device troubleshooting
- Automated device configuration
- Support ticket automation

### 4. Development Workflows
- UI/UX testing automation
- User flow validation
- Performance testing integration

## ⚠️ Important Notes

### Access Requirements

- **Investment Access**: This model is proprietary technology available exclusively to qualified investors under NDA
- **Authentication Required**: Must be authenticated with Hugging Face
- **Evaluation Only**: Access granted solely for investment evaluation purposes
- **Confidentiality**: All technical details are confidential

### ZeroGPU Limitations

- **Compatibility**: Currently exclusive to Gradio SDK
- **PyTorch Versions**: Limited to supported versions (2.1.2, 2.2.2, 2.4.0, 2.5.1)
- **Function Duration**: 60 seconds by default, configurable up to 120 seconds
- **Queue Priority**: PRO users get 5x more daily usage and the highest queue priority

### General Limitations

- **Market Hours**: Some features may be limited during market hours
- **Device Requirements**: Requires sufficient RAM/VRAM for model loading
- **Network**: Requires internet connection for model download
- **Authentication**: Must have approved access to the model

## 🔧 Troubleshooting

### Common Issues

1. **Model Loading Error**:
   - Ensure you're authenticated: `huggingface-cli login`
   - Check internet connection
   - Verify model access approval

2. **Memory Issues**:
   - Use CPU if GPU memory is insufficient
   - Close other applications
   - Consider using smaller batch sizes

3. **Authentication Errors**:
   - Log in to Hugging Face again
   - Check access approval status
   - Contact support if issues persist

4. **ZeroGPU Issues**:
   - Verify ZeroGPU is selected in Space settings
   - Check PyTorch version compatibility
   - Ensure function duration is within limits
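
The PyTorch compatibility check can be scripted. The sketch below compares a version string against the supported versions listed in this README. `is_supported_torch` is a hypothetical helper, and stripping a local build suffix such as `+cu121` is an assumption about how the installed version is reported.

```python
# Supported versions as listed in this README
SUPPORTED_TORCH_VERSIONS = ("2.1.2", "2.2.2", "2.4.0", "2.5.1")


def is_supported_torch(version: str) -> bool:
    """Return True if the version (minus any '+local' suffix) is supported."""
    base_version = version.split("+", 1)[0]
    return base_version in SUPPORTED_TORCH_VERSIONS
```

In a Space, the argument would typically be `torch.__version__`.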

### Performance Optimization

- **GPU Usage**: Use CUDA for faster inference
- **Memory Management**: Monitor VRAM usage
- **Batch Processing**: Process multiple images efficiently
- **ZeroGPU Optimization**: Specify appropriate function durations

## 📞 Support

- **Investment Inquiries**: For investment-related questions and due diligence
- **Technical Support**: For technical issues with the demo
- **Model Access**: For access requests to the L-Operator model
- **ZeroGPU Support**: [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)

## 📄 License

This demo is provided under the same terms as the L-Operator model:

- **Proprietary Technology**: Owned by Tonic
- **Investment Evaluation**: Access granted solely for investment evaluation
- **NDA Required**: All access is subject to Non-Disclosure Agreement
- **No Commercial Use**: Without written consent

## 🙏 Acknowledgments

- **LiquidAI**: For the base LFM2-VL model
- **Hugging Face**: For the transformers library, hosting, and ZeroGPU infrastructure
- **Gradio**: For the excellent UI framework

## 🔗 Links

- [L-Operator Model](https://huggingface.co/Tonic/l-android-control)
- [Base Model (LFM2-VL-1.6B)](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
- [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
- [LiquidAI](https://liquid.ai/)
- [Tonic](https://tonic.ai/)

---

**Made with ❤️ by Tonic**