---
title: L Operator Demo
emoji: πŸ“Š
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.44.0
app_file: app.py
pinned: true
license: gpl
short_description: demo of l-operator with no commands
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# πŸ€– L-Operator: Android Device Control Demo
A complete multimodal Gradio demo for the [L-Operator model](https://huggingface.co/Tonic/l-android-control), a fine-tuned multimodal AI agent based on LiquidAI's LFM2-VL-1.6B model, optimized for Android device control through visual understanding and action generation.
## 🌟 Features
- **Multimodal Interface**: Upload Android screenshots and provide text instructions
- **Chat Interface**: Interactive chat with the model using Gradio's ChatInterface component
- **Action Generation**: Generate JSON actions for Android device control
- **Example Episodes**: Pre-loaded examples from extracted training episodes
- **Real-time Processing**: Optimized for real-time inference
- **Beautiful UI**: Modern, responsive interface with comprehensive documentation
- **⚑ ZeroGPU Compatible**: Dynamic GPU allocation for cost-effective deployment
## πŸ“‹ Model Details
| Property | Value |
|----------|-------|
| **Base Model** | [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) |
| **Architecture** | LFM2-VL (1.6B parameters) |
| **Fine-tuning** | LoRA (Low-Rank Adaptation) |
| **Training Data** | Android control episodes with screenshots and actions |
| **License** | Proprietary (Investment Access Required) |
## πŸš€ Quick Start
### Prerequisites
1. **Python 3.8+**: Ensure you have Python 3.8 or higher installed
2. **Hugging Face Access**: Request access to the [L-Operator model](https://huggingface.co/Tonic/l-android-control)
3. **Authentication**: Login to Hugging Face using `huggingface-cli login`
### Installation
1. **Clone the repository**:
```bash
git clone <repository-url>
cd l-operator-demo
```
2. **Install dependencies**:
```bash
pip install -r requirements.txt
```
3. **Authenticate with Hugging Face**:
```bash
huggingface-cli login
```
### Running the Demo
1. **Start the demo**:
```bash
python app.py
```
2. **Open your browser** and navigate to `http://localhost:7860`
3. **Load the model** by clicking the "πŸš€ Load L-Operator Model" button
4. **Upload an Android screenshot** and provide instructions
5. **Generate actions** or use the chat interface
## ⚑ ZeroGPU Deployment
This demo is optimized for [Hugging Face Spaces ZeroGPU](https://huggingface.co/docs/hub/spaces-zerogpu), providing dynamic GPU allocation for cost-effective deployment.
### ZeroGPU Features
- **πŸ†“ Free GPU Access**: Dynamic NVIDIA H200 GPU allocation
- **⚑ On-Demand Resources**: GPUs allocated only when needed
- **πŸ’° Cost Efficient**: Optimized resource utilization
- **πŸ”„ Multi-GPU Support**: Leverage multiple GPUs concurrently
- **πŸ›‘οΈ Automatic Management**: Resources released after function completion
### ZeroGPU Specifications
| Specification | Value |
|---------------|-------|
| **GPU Type** | NVIDIA H200 slice |
| **Available VRAM** | 70GB per workload |
| **Supported Gradio** | 4+ |
| **Supported PyTorch** | 2.1.2, 2.2.2, 2.4.0, 2.5.1 |
| **Supported Python** | 3.10.13 |
| **Function Duration** | Up to 120 seconds per request |
### Deploying to Hugging Face Spaces
1. **Create a new Space** on Hugging Face:
- Choose **Gradio SDK**
- Select **ZeroGPU** in hardware options
- Upload your code
2. **Space Configuration**: the Space reads its settings from the YAML frontmatter at the top of this README:
```yaml
sdk: gradio
sdk_version: 5.44.0
app_file: app.py
```
`requirements.txt` is installed automatically, and ZeroGPU is enabled through the Space's hardware setting.
3. **Access Requirements**:
- **Personal accounts**: PRO subscription required
- **Organizations**: Enterprise Hub subscription required
- **Usage limits**: 10 Spaces (personal) / 50 Spaces (organization)
### ZeroGPU Integration Details
The demo automatically detects ZeroGPU availability and optimizes accordingly:
```python
# Automatic ZeroGPU detection, with a no-op fallback so the same code
# also runs locally where the `spaces` package is unavailable
try:
    import spaces
    ZEROGPU_AVAILABLE = True
except ImportError:
    ZEROGPU_AVAILABLE = False

    class spaces:  # stand-in that turns @spaces.GPU into a no-op
        @staticmethod
        def GPU(duration=60):
            def decorator(fn):
                return fn
            return decorator

# GPU-optimized functions
@spaces.GPU(duration=120)  # up to 2 minutes for action generation
def generate_action(image, goal, instruction):
    # GPU-accelerated inference
    ...

@spaces.GPU(duration=90)  # up to 1.5 minutes for chat responses
def chat_with_model(message, history, image):
    # Interactive chat with GPU acceleration
    ...
```
## 🎯 How to Use
### Basic Usage
1. **Load Model**: Click "πŸš€ Load L-Operator Model" to initialize the model
2. **Upload Screenshot**: Upload an Android device screenshot
3. **Provide Instructions**:
- **Goal**: Describe what you want to achieve
- **Step**: Provide specific step instructions
4. **Generate Action**: Click "🎯 Generate Action" to get JSON output
### Chat Interface
1. **Upload Screenshot**: Upload an Android screenshot
2. **Send Message**: Use structured format:
```
Goal: Open the Settings app and navigate to Display settings
Step: Tap on the Settings app icon on the home screen
```
3. **Get Response**: The model will generate JSON actions
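The structured `Goal:`/`Step:` message can be split into its fields with a few lines of Python. This is an illustrative sketch; `parse_instruction` is a hypothetical helper, not part of the demo code:

```python
# Hypothetical helper: split a structured chat message into its fields.
def parse_instruction(message: str) -> dict:
    fields = {"goal": "", "step": ""}
    for line in message.splitlines():
        key, _, value = line.partition(":")
        key = key.strip().lower()
        if key in fields:
            fields[key] = value.strip()
    return fields

# Example usage with the message format shown above
print(parse_instruction(
    "Goal: Open the Settings app and navigate to Display settings\n"
    "Step: Tap on the Settings app icon on the home screen"
))
```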
### Example Episodes
The demo includes pre-loaded examples from the training episodes:
- **Episode 13**: Cruise deals app navigation
- **Episode 53**: Pinterest search for sustainability art
- **Episode 73**: Moon phases app usage
## πŸ“Š Expected Output Format
The model generates JSON actions in the following format:
```json
{
"action_type": "tap",
"x": 540,
"y": 1200,
"text": "Settings",
"app_name": "com.android.settings",
"confidence": 0.92
}
```
### Action Types
- `tap`: Tap at specific coordinates
- `click`: Click at specific coordinates
- `scroll`: Scroll in a direction (up/down/left/right)
- `input_text`: Input text
- `open_app`: Open a specific app
- `wait`: Wait for a moment
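As an illustration of how these action types could be executed on a device, here is a hedged sketch that maps an action dict to an `adb` shell command. `action_to_adb` and the `direction` field for scrolls are assumptions for this example, not part of the demo:

```python
# Hypothetical mapping from generated actions to adb commands; the
# action dicts follow the JSON format shown above.
def action_to_adb(action: dict) -> str:
    t = action["action_type"]
    if t in ("tap", "click"):
        return f"adb shell input tap {action['x']} {action['y']}"
    if t == "input_text":
        return f"adb shell input text '{action['text']}'"
    if t == "open_app":
        return f"adb shell monkey -p {action['app_name']} 1"
    if t == "scroll":
        swipes = {  # rough swipe coordinates per direction (assumed screen size)
            "up": (540, 1500, 540, 500),
            "down": (540, 500, 540, 1500),
            "left": (900, 1000, 200, 1000),
            "right": (200, 1000, 900, 1000),
        }
        x1, y1, x2, y2 = swipes[action["direction"]]
        return f"adb shell input swipe {x1} {y1} {x2} {y2}"
    if t == "wait":
        return "sleep 1"
    raise ValueError(f"unknown action_type: {t}")

print(action_to_adb({"action_type": "tap", "x": 540, "y": 1200}))
# adb shell input tap 540 1200
```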
## πŸ› οΈ Technical Details
### Model Configuration
- **Device**: Automatically detects CUDA/CPU
- **Precision**: bfloat16 for CUDA, float32 for CPU
- **Generation**: Temperature 0.7, Top-p 0.9
- **Max Tokens**: 128 for action generation
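The configuration above can be sketched in code. The Auto class name and loading flags are assumptions; consult the model card for the exact loading snippet:

```python
import torch

MODEL_ID = "Tonic/l-android-control"  # gated: requires approved HF access

def load_model():
    """Load processor and model with the device/precision policy above.

    `AutoModelForImageTextToText` is an assumption about the right Auto
    class for LFM2-VL; verify against the model card.
    """
    from transformers import AutoModelForImageTextToText, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.bfloat16 if device == "cuda" else torch.float32
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForImageTextToText.from_pretrained(
        MODEL_ID, torch_dtype=dtype, trust_remote_code=True
    ).to(device)
    return processor, model, device

# Sampling settings from the Model Configuration table above
GEN_KWARGS = dict(max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9)
```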
### Architecture
- **Base Model**: LFM2-VL-1.6B from LiquidAI
- **Fine-tuning**: LoRA with rank 16, alpha 32
- **Target Modules**: q_proj, v_proj, fc1, fc2, linear, gate_proj, up_proj, down_proj
### Performance
- **Model Size**: ~1.6B parameters
- **Memory Usage**: ~4GB VRAM (CUDA) / ~8GB RAM (CPU)
- **Inference Speed**: Optimized for real-time use
- **Accuracy**: 98% action accuracy on test episodes
## 🎯 Use Cases
### 1. Mobile App Testing
- Automated UI testing for Android applications
- Cross-device compatibility validation
- Regression testing with visual verification
### 2. Accessibility Applications
- Voice-controlled device navigation
- Assistive technology integration
- Screen reader enhancement tools
### 3. Remote Support
- Remote device troubleshooting
- Automated device configuration
- Support ticket automation
### 4. Development Workflows
- UI/UX testing automation
- User flow validation
- Performance testing integration
## ⚠️ Important Notes
### Access Requirements
- **Investment Access**: This model is proprietary technology available exclusively to qualified investors under NDA
- **Authentication Required**: Must be authenticated with Hugging Face
- **Evaluation Only**: Access granted solely for investment evaluation purposes
- **Confidentiality**: All technical details are confidential
### ZeroGPU Limitations
- **Compatibility**: Currently exclusive to Gradio SDK
- **PyTorch Versions**: Limited to supported versions (2.1.2, 2.2.2, 2.4.0, 2.5.1)
- **Function Duration**: 60 seconds by default, customizable per function (this demo uses up to 120 seconds)
- **Queue Priority**: PRO users get 5× more daily usage and the highest queue priority
### General Limitations
- **Market Hours**: Some features may be limited during market hours
- **Device Requirements**: Requires sufficient RAM/VRAM for model loading
- **Network**: Requires internet connection for model download
- **Authentication**: Must have approved access to the model
## πŸ”§ Troubleshooting
### Common Issues
1. **Model Loading Error**:
- Ensure you're authenticated: `huggingface-cli login`
- Check internet connection
- Verify model access approval
2. **Memory Issues**:
- Use CPU if GPU memory is insufficient
- Close other applications
- Consider using smaller batch sizes
3. **Authentication Errors**:
- Re-login to Hugging Face
- Check access approval status
- Contact support if issues persist
4. **ZeroGPU Issues**:
- Verify ZeroGPU is selected in Space settings
- Check PyTorch version compatibility
- Ensure function duration is within limits
### Performance Optimization
- **GPU Usage**: Use CUDA for faster inference
- **Memory Management**: Monitor VRAM usage
- **Batch Processing**: Process multiple images efficiently
- **ZeroGPU Optimization**: Specify appropriate function durations
## πŸ“ž Support
- **Investment Inquiries**: For investment-related questions and due diligence
- **Technical Support**: For technical issues with the demo
- **Model Access**: For access requests to the L-Operator model
- **ZeroGPU Support**: [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
## πŸ“„ License
This demo is provided under the same terms as the L-Operator model:
- **Proprietary Technology**: Owned by Tonic
- **Investment Evaluation**: Access granted solely for investment evaluation
- **NDA Required**: All access is subject to Non-Disclosure Agreement
- **No Commercial Use**: Commercial use is prohibited without written consent
## πŸ™ Acknowledgments
- **LiquidAI**: For the base LFM2-VL model
- **Hugging Face**: For the transformers library, hosting, and ZeroGPU infrastructure
- **Gradio**: For the excellent UI framework
## πŸ”— Links
- [L-Operator Model](https://huggingface.co/Tonic/l-android-control)
- [Base Model (LFM2-VL-1.6B)](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
- [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
- [LiquidAI](https://liquid.ai/)
- [Tonic](https://tonic.ai/)
---
**Made with ❀️ by Tonic**