---
title: L Operator Demo
emoji: 🤖
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.44.0
app_file: app.py
pinned: true
license: gpl
short_description: demo of l-operator with no commands
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🤖 L-Operator: Android Device Control Demo

A multimodal Gradio demo for L-Operator, a fine-tuned AI agent based on LiquidAI's LFM2-VL-1.6B model, optimized for Android device control through visual understanding and action generation.
## Features

- **Multimodal Interface**: Upload Android screenshots and provide text instructions
- **Chat Interface**: Interactive chat with the model using Gradio's ChatInterface component
- **Action Generation**: Generate JSON actions for Android device control
- **Example Episodes**: Pre-loaded examples from extracted training episodes
- **Real-time Processing**: Optimized for real-time inference
- **Modern UI**: Responsive interface with comprehensive documentation
- **⚡ ZeroGPU Compatible**: Dynamic GPU allocation for cost-effective deployment
## Model Details

| Property | Value |
|---|---|
| Base Model | LiquidAI/LFM2-VL-1.6B |
| Architecture | LFM2-VL (1.6B parameters) |
| Fine-tuning | LoRA (Low-Rank Adaptation) |
| Training Data | Android control episodes with screenshots and actions |
| License | Proprietary (Investment Access Required) |
## Quick Start

### Prerequisites

- **Python 3.8+**: Ensure you have Python 3.8 or higher installed
- **Hugging Face Access**: Request access to the L-Operator model
- **Authentication**: Log in to Hugging Face using `huggingface-cli login`
### Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd l-operator-demo
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Authenticate with Hugging Face:

   ```bash
   huggingface-cli login
   ```
### Running the Demo

1. Start the demo:

   ```bash
   python app.py
   ```

2. Open your browser and navigate to `http://localhost:7860`
3. Load the model by clicking the "Load L-Operator Model" button
4. Upload an Android screenshot and provide instructions
5. Generate actions or use the chat interface
## ⚡ ZeroGPU Deployment

This demo is optimized for Hugging Face Spaces ZeroGPU, which provides dynamic GPU allocation for cost-effective deployment.

### ZeroGPU Features

- **Free GPU Access**: Dynamic NVIDIA H200 GPU allocation
- **⚡ On-Demand Resources**: GPUs allocated only when needed
- **💰 Cost Efficient**: Optimized resource utilization
- **Multi-GPU Support**: Leverage multiple GPUs concurrently
- **🛡️ Automatic Management**: Resources released after function completion
### ZeroGPU Specifications

| Specification | Value |
|---|---|
| GPU Type | NVIDIA H200 slice |
| Available VRAM | 70 GB per workload |
| Supported Gradio | 4+ |
| Supported PyTorch | 2.1.2, 2.2.2, 2.4.0, 2.5.1 |
| Supported Python | 3.10.13 |
| Function Duration | Up to 120 seconds per request |
### Deploying to Hugging Face Spaces

1. **Create a new Space** on Hugging Face:
   - Choose the Gradio SDK
   - Select ZeroGPU in the hardware options
   - Upload your code
2. **Space configuration**:

   ```bash
   # app.py is automatically detected
   # requirements.txt is automatically installed
   # ZeroGPU is automatically configured
   ```

3. **Access requirements**:
   - Personal accounts: PRO subscription required
   - Organizations: Enterprise Hub subscription required
   - Usage limits: 10 Spaces (personal) / 50 Spaces (organization)
### ZeroGPU Integration Details

The demo automatically detects ZeroGPU availability and optimizes accordingly:

```python
# Automatic ZeroGPU detection
try:
    import spaces
    ZEROGPU_AVAILABLE = True
except ImportError:
    ZEROGPU_AVAILABLE = False

# GPU-optimized functions
@spaces.GPU(duration=120)  # 2 minutes for action generation
def generate_action(self, image, goal, instruction):
    # GPU-accelerated inference
    pass

@spaces.GPU(duration=90)  # 1.5 minutes for chat responses
def chat_with_model(self, message, history, image):
    # Interactive chat with GPU acceleration
    pass
```
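For local runs outside Spaces, the `spaces` package is typically not installed, so applying `spaces.GPU` directly would fail. One way to keep the same decorator usable everywhere is a small guard; this is a sketch, and the `gpu` wrapper and `generate_action_stub` names are introduced here for illustration, not taken from the demo code:

```python
def gpu(duration: int = 60):
    """Return spaces.GPU on ZeroGPU; otherwise a no-op decorator for local runs."""
    try:
        import spaces  # only present on Hugging Face Spaces with ZeroGPU
        return spaces.GPU(duration=duration)
    except ImportError:
        def passthrough(fn):
            return fn  # leave the function unchanged when ZeroGPU is absent
        return passthrough

@gpu(duration=120)
def generate_action_stub(goal: str) -> str:
    # Stand-in for the real GPU-accelerated inference
    return f"action for: {goal}"
```

With this guard, the decorated functions run unchanged on CPU-only machines while still requesting GPU time when deployed on ZeroGPU.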
## 🎯 How to Use

### Basic Usage

1. **Load Model**: Click "Load L-Operator Model" to initialize the model
2. **Upload Screenshot**: Upload an Android device screenshot
3. **Provide Instructions**:
   - **Goal**: Describe what you want to achieve
   - **Step**: Provide specific step instructions
4. **Generate Action**: Click "🎯 Generate Action" to get JSON output
### Chat Interface

1. **Upload Screenshot**: Upload an Android screenshot
2. **Send Message**: Use the structured format:

   ```
   Goal: Open the Settings app and navigate to Display settings
   Step: Tap on the Settings app icon on the home screen
   ```

3. **Get Response**: The model will generate JSON actions
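A structured "Goal: ... Step: ..." message can be split back into its two fields with a small helper. This is a sketch; `parse_chat_message` is a name introduced here for illustration and is not part of the demo code:

```python
import re

def parse_chat_message(message: str) -> dict:
    """Split a 'Goal: ... Step: ...' message into its two fields."""
    match = re.match(r"Goal:\s*(?P<goal>.*?)\s*Step:\s*(?P<step>.*)", message, re.DOTALL)
    if not match:
        raise ValueError("expected 'Goal: ... Step: ...' format")
    return match.groupdict()

msg = ("Goal: Open the Settings app and navigate to Display settings "
       "Step: Tap on the Settings app icon on the home screen")
fields = parse_chat_message(msg)
```

A helper like this keeps prompt construction deterministic on the app side regardless of how the user spaces or line-breaks the message.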
### Example Episodes

The demo includes pre-loaded examples from the training episodes:

- **Episode 13**: Cruise deals app navigation
- **Episode 53**: Pinterest search for sustainability art
- **Episode 73**: Moon phases app usage
## Expected Output Format

The model generates JSON actions in the following format:

```json
{
  "action_type": "tap",
  "x": 540,
  "y": 1200,
  "text": "Settings",
  "app_name": "com.android.settings",
  "confidence": 0.92
}
```
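A consumer of this output would typically parse and sanity-check the JSON before acting on it. The following is a minimal sketch; the `parse_action` helper and the choice of required keys are illustrative assumptions, not part of the demo code:

```python
import json

# "action_type" is the one field every action carries; the others
# ("text", "confidence", coordinates, ...) vary by action type.
KNOWN_ACTION_TYPES = {"tap", "click", "scroll", "input_text", "open_app", "wait"}

def parse_action(raw: str) -> dict:
    """Parse a model response into an action dict, validating its type."""
    action = json.loads(raw)
    if "action_type" not in action:
        raise ValueError("missing key: action_type")
    if action["action_type"] not in KNOWN_ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action['action_type']}")
    return action

raw = ('{"action_type": "tap", "x": 540, "y": 1200, "text": "Settings", '
       '"app_name": "com.android.settings", "confidence": 0.92}')
action = parse_action(raw)
print(action["action_type"], action["x"], action["y"])  # tap 540 1200
```

Validating up front keeps malformed generations from reaching the device-control layer.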
### Action Types

- `tap`: Tap at specific coordinates
- `click`: Click at specific coordinates
- `scroll`: Scroll in a direction (up/down/left/right)
- `input_text`: Input text
- `open_app`: Open a specific app
- `wait`: Wait for a moment
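One common way to consume these action types is a dispatch table mapping each `action_type` to a handler. A sketch with stub handlers follows; the handler names are illustrative, and `input_text`/`open_app` are omitted for brevity:

```python
def handle_tap(action: dict) -> str:
    return f"tap at ({action['x']}, {action['y']})"

def handle_scroll(action: dict) -> str:
    return f"scroll {action.get('direction', 'down')}"

def handle_wait(action: dict) -> str:
    return "wait"

# "click" reuses the tap handler since both act on coordinates
DISPATCH = {
    "tap": handle_tap,
    "click": handle_tap,
    "scroll": handle_scroll,
    "wait": handle_wait,
}

def execute(action: dict) -> str:
    try:
        handler = DISPATCH[action["action_type"]]
    except KeyError:
        raise ValueError(f"unsupported action_type: {action['action_type']}")
    return handler(action)

print(execute({"action_type": "tap", "x": 540, "y": 1200}))  # tap at (540, 1200)
```

A table like this keeps the executor closed over a known set of actions, so unexpected model output fails loudly instead of silently.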
## 🛠️ Technical Details

### Model Configuration

- **Device**: Automatically detects CUDA/CPU
- **Precision**: bfloat16 for CUDA, float32 for CPU
- **Generation**: Temperature 0.7, Top-p 0.9
- **Max Tokens**: 128 for action generation
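The generation settings above map directly onto `transformers` generation kwargs. A sketch follows; the `GENERATION_KWARGS` name and `do_sample=True` are assumptions made here (temperature and top-p only take effect when sampling is enabled), not values confirmed from the demo code:

```python
# Sampling settings from the bullets above
GENERATION_KWARGS = {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_new_tokens": 128,   # cap for action generation
    "do_sample": True,       # assumed: required for temperature/top-p to apply
}
```

In a `transformers` pipeline these would be passed as `model.generate(**inputs, **GENERATION_KWARGS)`.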
### Architecture

- **Base Model**: LFM2-VL-1.6B from LiquidAI
- **Fine-tuning**: LoRA with rank 16, alpha 32
- **Target Modules**: q_proj, v_proj, fc1, fc2, linear, gate_proj, up_proj, down_proj
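As a config fragment, these hyperparameters would look roughly like the following `peft` `LoraConfig`; this is a sketch, and `task_type` and `lora_dropout` are assumptions not stated above:

```python
from peft import LoraConfig

# LoRA hyperparameters from the Architecture bullets above
lora_config = LoraConfig(
    r=16,             # LoRA rank
    lora_alpha=32,    # scaling factor
    target_modules=[
        "q_proj", "v_proj", "fc1", "fc2",
        "linear", "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,        # assumed; not specified above
    task_type="CAUSAL_LM",    # assumed for a VLM's language backbone
)
```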
### Performance

- **Model Size**: ~1.6B parameters
- **Memory Usage**: ~4 GB VRAM (CUDA) / ~8 GB RAM (CPU)
- **Inference Speed**: Optimized for real-time use
- **Accuracy**: 98% action accuracy on test episodes
## 🎯 Use Cases

### 1. Mobile App Testing

- Automated UI testing for Android applications
- Cross-device compatibility validation
- Regression testing with visual verification

### 2. Accessibility Applications

- Voice-controlled device navigation
- Assistive technology integration
- Screen reader enhancement tools

### 3. Remote Support

- Remote device troubleshooting
- Automated device configuration
- Support ticket automation

### 4. Development Workflows

- UI/UX testing automation
- User flow validation
- Performance testing integration
## ⚠️ Important Notes

### Access Requirements

- **Investment Access**: This model is proprietary technology available exclusively to qualified investors under NDA
- **Authentication Required**: Must be authenticated with Hugging Face
- **Evaluation Only**: Access granted solely for investment evaluation purposes
- **Confidentiality**: All technical details are confidential

### ZeroGPU Limitations

- **Compatibility**: Currently exclusive to the Gradio SDK
- **PyTorch Versions**: Limited to supported versions (2.1.2, 2.2.2, 2.4.0, 2.5.1)
- **Function Duration**: 60 seconds by default, customizable up to 120 seconds
- **Queue Priority**: PRO users get 5× more daily usage and the highest priority

### General Limitations

- **Market Hours**: Some features may be limited during market hours
- **Device Requirements**: Requires sufficient RAM/VRAM for model loading
- **Network**: Requires an internet connection for model download
- **Authentication**: Must have approved access to the model
## 🔧 Troubleshooting

### Common Issues

**Model Loading Error**:
- Ensure you're authenticated: `huggingface-cli login`
- Check your internet connection
- Verify model access approval

**Memory Issues**:
- Use the CPU if GPU memory is insufficient
- Close other applications
- Consider using smaller batch sizes

**Authentication Errors**:
- Re-login to Hugging Face
- Check your access approval status
- Contact support if issues persist

**ZeroGPU Issues**:
- Verify ZeroGPU is selected in the Space settings
- Check PyTorch version compatibility
- Ensure function durations are within limits
### Performance Optimization

- **GPU Usage**: Use CUDA for faster inference
- **Memory Management**: Monitor VRAM usage
- **Batch Processing**: Process multiple images efficiently
- **ZeroGPU Optimization**: Specify appropriate function durations
## Support

- **Investment Inquiries**: For investment-related questions and due diligence
- **Technical Support**: For technical issues with the demo
- **Model Access**: For access requests to the L-Operator model
- **ZeroGPU Support**: See the ZeroGPU documentation
## License

This demo is provided under the same terms as the L-Operator model:

- **Proprietary Technology**: Owned by Tonic
- **Investment Evaluation**: Access granted solely for investment evaluation
- **NDA Required**: All access is subject to a Non-Disclosure Agreement
- **No Commercial Use**: No commercial use without written consent
## Acknowledgments

- **LiquidAI**: For the base LFM2-VL model
- **Hugging Face**: For the transformers library, hosting, and ZeroGPU infrastructure
- **Gradio**: For the excellent UI framework
## Links

Made with ❤️ by Tonic