---
title: L Operator Demo
emoji: πŸ“Š
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.44.0
app_file: app.py
pinned: true
license: gpl
short_description: demo of l-operator with no commands
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# πŸ€– L-Operator: Android Device Control Demo

A multimodal Gradio demo for L-Operator, a fine-tuned AI agent based on LiquidAI's LFM2-VL-1.6B model and optimized for Android device control through visual understanding and action generation.

## 🌟 Features

- **Multimodal Interface**: Upload Android screenshots and provide text instructions
- **Chat Interface**: Interactive chat with the model using Gradio's `ChatInterface` component
- **Action Generation**: Generate JSON actions for Android device control
- **Example Episodes**: Pre-loaded examples from extracted training episodes
- **Real-time Processing**: Optimized for real-time inference
- **Modern UI**: Responsive interface with comprehensive documentation
- **⚑ ZeroGPU Compatible**: Dynamic GPU allocation for cost-effective deployment

## πŸ“‹ Model Details

| Property | Value |
|----------|-------|
| Base Model | LiquidAI/LFM2-VL-1.6B |
| Architecture | LFM2-VL (1.6B parameters) |
| Fine-tuning | LoRA (Low-Rank Adaptation) |
| Training Data | Android control episodes with screenshots and actions |
| License | Proprietary (Investment Access Required) |

## πŸš€ Quick Start

### Prerequisites

1. **Python 3.8+**: Ensure you have Python 3.8 or higher installed
2. **Hugging Face Access**: Request access to the L-Operator model
3. **Authentication**: Log in to Hugging Face using `huggingface-cli login`

### Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd l-operator-demo
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Authenticate with Hugging Face:

   ```bash
   huggingface-cli login
   ```

### Running the Demo

1. Start the demo:

   ```bash
   python app.py
   ```

2. Open your browser and navigate to http://localhost:7860

3. Load the model by clicking the "πŸš€ Load L-Operator Model" button

4. Upload an Android screenshot and provide instructions

5. Generate actions or use the chat interface

## ⚑ ZeroGPU Deployment

This demo is optimized for Hugging Face Spaces ZeroGPU, providing dynamic GPU allocation for cost-effective deployment.

### ZeroGPU Features

- **πŸ†“ Free GPU Access**: Dynamic NVIDIA H200 GPU allocation
- **⚑ On-Demand Resources**: GPUs allocated only when needed
- **πŸ’° Cost Efficient**: Optimized resource utilization
- **πŸ”„ Multi-GPU Support**: Leverage multiple GPUs concurrently
- **πŸ›‘οΈ Automatic Management**: Resources released after function completion

### ZeroGPU Specifications

| Specification | Value |
|---------------|-------|
| GPU Type | NVIDIA H200 slice |
| Available VRAM | 70 GB per workload |
| Supported Gradio | 4+ |
| Supported PyTorch | 2.1.2, 2.2.2, 2.4.0, 2.5.1 |
| Supported Python | 3.10.13 |
| Function Duration | Up to 120 seconds per request |

### Deploying to Hugging Face Spaces

1. **Create a new Space** on Hugging Face:
   - Choose the Gradio SDK
   - Select ZeroGPU in the hardware options
   - Upload your code

2. **Space configuration**:
   - `app.py` is automatically detected
   - `requirements.txt` is automatically installed
   - ZeroGPU is automatically configured

3. **Access requirements**:
   - Personal accounts: PRO subscription required
   - Organizations: Enterprise Hub subscription required
   - Usage limits: 10 Spaces (personal) / 50 Spaces (organization)

### ZeroGPU Integration Details

The demo automatically detects ZeroGPU availability and optimizes accordingly:

```python
# Automatic ZeroGPU detection
try:
    import spaces
    ZEROGPU_AVAILABLE = True
except ImportError:
    ZEROGPU_AVAILABLE = False

# GPU-optimized methods: each decorator requests a GPU for the given duration
@spaces.GPU(duration=120)  # up to 2 minutes for action generation
def generate_action(self, image, goal, instruction):
    # GPU-accelerated inference
    ...

@spaces.GPU(duration=90)   # up to 1.5 minutes for chat responses
def chat_with_model(self, message, history, image):
    # Interactive chat with GPU acceleration
    ...
```

## 🎯 How to Use

### Basic Usage

1. **Load Model**: Click "πŸš€ Load L-Operator Model" to initialize the model
2. **Upload Screenshot**: Upload an Android device screenshot
3. **Provide Instructions**:
   - **Goal**: Describe what you want to achieve
   - **Step**: Provide specific step instructions
4. **Generate Action**: Click "🎯 Generate Action" to get JSON output

### Chat Interface

1. **Upload Screenshot**: Upload an Android screenshot
2. **Send Message**: Use the structured format:

   ```text
   Goal: Open the Settings app and navigate to Display settings
   Step: Tap on the Settings app icon on the home screen
   ```

3. **Get Response**: The model will generate JSON actions
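As a sketch, the structured message above can be assembled programmatically before it is sent to the chat interface (the helper name below is illustrative, not part of the demo's API):

```python
# Hypothetical helper that assembles the structured chat message shown
# above; the function name is illustrative, not part of the demo's API.
def build_message(goal: str, step: str) -> str:
    return f"Goal: {goal}\nStep: {step}"

message = build_message(
    "Open the Settings app and navigate to Display settings",
    "Tap on the Settings app icon on the home screen",
)
print(message)
```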

### Example Episodes

The demo includes pre-loaded examples from the training episodes:

- **Episode 13**: Cruise deals app navigation
- **Episode 53**: Pinterest search for sustainability art
- **Episode 73**: Moon phases app usage

## πŸ“Š Expected Output Format

The model generates JSON actions in the following format:

```json
{
  "action_type": "tap",
  "x": 540,
  "y": 1200,
  "text": "Settings",
  "app_name": "com.android.settings",
  "confidence": 0.92
}
```

### Action Types

- `tap`: Tap at specific coordinates
- `click`: Click at specific coordinates
- `scroll`: Scroll in a direction (up/down/left/right)
- `input_text`: Input text
- `open_app`: Open a specific app
- `wait`: Wait for a moment
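As a sketch of how a client might consume this output, the snippet below parses and sanity-checks a generated action. The validator is illustrative, not part of the demo; it assumes the fields and action types listed above:

```python
import json

# Hypothetical validator for model-generated actions. The recognised
# types come from the Action Types list above; the helper itself is
# illustrative, not part of the demo.
VALID_ACTION_TYPES = {"tap", "click", "scroll", "input_text", "open_app", "wait"}

def parse_action(raw: str) -> dict:
    """Parse a JSON action string and check its basic shape."""
    action = json.loads(raw)
    if action.get("action_type") not in VALID_ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action.get('action_type')!r}")
    if action["action_type"] in ("tap", "click"):
        # Tap/click actions need integer screen coordinates.
        if not isinstance(action.get("x"), int) or not isinstance(action.get("y"), int):
            raise ValueError("tap/click requires integer x and y")
    return action

action = parse_action('{"action_type": "tap", "x": 540, "y": 1200}')
print(action["action_type"])  # tap
```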

πŸ› οΈ Technical Details

Model Configuration

  • Device: Automatically detects CUDA/CPU
  • Precision: bfloat16 for CUDA, float32 for CPU
  • Generation: Temperature 0.7, Top-p 0.9
  • Max Tokens: 128 for action generation

### Architecture

- **Base Model**: LFM2-VL-1.6B from LiquidAI
- **Fine-tuning**: LoRA with rank 16, alpha 32
- **Target Modules**: `q_proj`, `v_proj`, `fc1`, `fc2`, `linear`, `gate_proj`, `up_proj`, `down_proj`
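For reference, these hyperparameters could be collected into the keyword arguments one would pass to peft's `LoraConfig`. This is a sketch of the values from this section; the variable name is illustrative:

```python
# Hypothetical summary of the LoRA setup described above, shaped like
# the keyword arguments for peft's LoraConfig. Values come from this
# section; the variable name is illustrative.
LORA_KWARGS = {
    "r": 16,           # LoRA rank
    "lora_alpha": 32,  # scaling factor
    "target_modules": [
        "q_proj", "v_proj", "fc1", "fc2",
        "linear", "gate_proj", "up_proj", "down_proj",
    ],
}
print(len(LORA_KWARGS["target_modules"]))  # 8
```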

### Performance

- **Model Size**: ~1.6B parameters
- **Memory Usage**: ~4 GB VRAM (CUDA) / ~8 GB RAM (CPU)
- **Inference Speed**: Optimized for real-time use
- **Accuracy**: 98% action accuracy on test episodes

## 🎯 Use Cases

### 1. Mobile App Testing

- Automated UI testing for Android applications
- Cross-device compatibility validation
- Regression testing with visual verification

### 2. Accessibility Applications

- Voice-controlled device navigation
- Assistive technology integration
- Screen reader enhancement tools

### 3. Remote Support

- Remote device troubleshooting
- Automated device configuration
- Support ticket automation

### 4. Development Workflows

- UI/UX testing automation
- User flow validation
- Performance testing integration

## ⚠️ Important Notes

### Access Requirements

- **Investment Access**: This model is proprietary technology available exclusively to qualified investors under NDA
- **Authentication Required**: Must be authenticated with Hugging Face
- **Evaluation Only**: Access granted solely for investment evaluation purposes
- **Confidentiality**: All technical details are confidential

### ZeroGPU Limitations

- **Compatibility**: Currently exclusive to the Gradio SDK
- **PyTorch Versions**: Limited to supported versions (2.1.2, 2.2.2, 2.4.0, 2.5.1)
- **Function Duration**: 60 seconds by default, customizable up to 120 seconds
- **Queue Priority**: PRO users get 5x more daily usage and the highest priority

### General Limitations

- **Market Hours**: Some features may be limited during market hours
- **Device Requirements**: Requires sufficient RAM/VRAM for model loading
- **Network**: Requires an internet connection for model download
- **Authentication**: Must have approved access to the model

## πŸ”§ Troubleshooting

### Common Issues

1. **Model Loading Error**:
   - Ensure you're authenticated: `huggingface-cli login`
   - Check internet connection
   - Verify model access approval

2. **Memory Issues**:
   - Use the CPU if GPU memory is insufficient
   - Close other applications
   - Consider using smaller batch sizes

3. **Authentication Errors**:
   - Re-login to Hugging Face
   - Check access approval status
   - Contact support if issues persist

4. **ZeroGPU Issues**:
   - Verify ZeroGPU is selected in the Space settings
   - Check PyTorch version compatibility
   - Ensure function durations are within limits

### Performance Optimization

- **GPU Usage**: Use CUDA for faster inference
- **Memory Management**: Monitor VRAM usage
- **Batch Processing**: Process multiple images efficiently
- **ZeroGPU Optimization**: Specify appropriate function durations

## πŸ“ž Support

- **Investment Inquiries**: For investment-related questions and due diligence
- **Technical Support**: For technical issues with the demo
- **Model Access**: For access requests to the L-Operator model
- **ZeroGPU Support**: See the ZeroGPU documentation

## πŸ“„ License

This demo is provided under the same terms as the L-Operator model:

- **Proprietary Technology**: Owned by Tonic
- **Investment Evaluation**: Access granted solely for investment evaluation
- **NDA Required**: All access is subject to a Non-Disclosure Agreement
- **No Commercial Use**: No commercial use without written consent

πŸ™ Acknowledgments

  • LiquidAI: For the base LFM2-VL model
  • Hugging Face: For the transformers library, hosting, and ZeroGPU infrastructure
  • Gradio: For the excellent UI framework

## πŸ”— Links


Made with ❀️ by Tonic