pi05_libero_continuous_state
A π₀.₅ (pi0.5) Vision-Language-Action (VLA) model with continuous states, finetuned on the LIBERO robotic manipulation benchmark using the OpenTau training framework. The model follows natural language instructions to perform manipulation tasks in a simulated tabletop environment. Unlike the original π₀.₅, which encodes the robot state as discrete tokens, this variant represents it with a single continuous state token.
For full documentation, evaluation results, and inference code, please visit the repository:
👉 https://github.com/TensorAuto/OpenTau
Model Details
Description
- Model Type: Vision-Language-Action (VLA) Model
- Base Architecture: π₀.₅ (pi0.5) by Physical Intelligence (with continuous state)
- Backbone: PaliGemma-3B (VLM) + Gemma-300M (Action Expert)
- Training Data: LIBERO (Lifelong Robot Learning) Benchmark
- Framework: OpenTau
Architecture
The π₀.₅ architecture uses a flow-matching-based policy designed for open-world generalization. It combines a Vision-Language Model (VLM) for high-level semantic understanding with a smaller "action expert" that generates continuous joint trajectories (10-step action chunks) via flow matching. In this variant, the discrete state tokens are replaced with a single continuous state token: the proprioceptive state is projected directly into the model's embedding space rather than discretized into bins.
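To make the two ideas concrete, here is a minimal sketch of a continuous state token and a flow-matching sampler. This is an illustration, not the OpenTau implementation: the module and function names (`ContinuousStateToken`, `sample_action_chunk`, `velocity_net`), the state and embedding dimensions, and the 7-dim action assumption are all placeholders; in the real model the conditioning `context` would come from the PaliGemma backbone.

```python
import torch
import torch.nn as nn

ACTION_DIM = 7    # assumption: LIBERO-style end-effector + gripper actions
CHUNK_LEN = 10    # 10-step action chunks, as described above
STATE_DIM = 8     # assumption: proprioceptive state dimension
EMBED_DIM = 1024  # assumption: action-expert hidden size

class ContinuousStateToken(nn.Module):
    """Project the raw proprioceptive state into one token embedding,
    replacing the discrete (binned) state tokens of the original pi0.5."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(STATE_DIM, EMBED_DIM)

    def forward(self, state):                  # state: (B, STATE_DIM)
        return self.proj(state).unsqueeze(1)   # (B, 1, EMBED_DIM): one token

@torch.no_grad()
def sample_action_chunk(velocity_net, context, num_steps=10):
    """Euler integration of a learned flow from noise to an action chunk.
    velocity_net(a, t, context) -> velocity of shape (B, CHUNK_LEN, ACTION_DIM)."""
    B = context.shape[0]
    a = torch.randn(B, CHUNK_LEN, ACTION_DIM)     # start from Gaussian noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((B,), i * dt)
        a = a + dt * velocity_net(a, t, context)  # follow the flow toward the data
    return a                                      # denoised 10-step action chunk
```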
Training and Evaluation
Dataset
This model was finetuned on the LIBERO benchmark dataset. The LIBERO suite consists of human-teleoperated demonstrations for tabletop manipulation, covering four task suites (a small sketch of iterating over them follows the list):
- Spatial Generalization (libero_spatial)
- Object Generalization (libero_object)
- Goal Generalization (libero_goal)
- Long-Horizon Tasks (libero_10)
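For orientation, the four suite identifiers can be swept in a simple loop as below. The `run_eval` helper is hypothetical; the actual evaluation entry points are documented in the OpenTau repository.

```python
# Suite IDs from the list above, mapped to what each one stresses.
LIBERO_SUITES = {
    "libero_spatial": "spatial generalization",
    "libero_object":  "object generalization",
    "libero_goal":    "goal generalization",
    "libero_10":      "long-horizon tasks",
}

for suite, focus in LIBERO_SUITES.items():
    print(f"evaluating {suite} ({focus})")
    # run_eval(model, suite)  # hypothetical helper; see the OpenTau repo for the real CLI
```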
Results
For detailed usage instructions, success rates, baseline comparisons, and evaluation protocols, please refer to the OpenTau GitHub repository.