LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models
Paper | Repo | Website
Overview
This repository contains the official implementation and benchmark for our paper "LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models". We systematically expose the hidden vulnerabilities of contemporary VLA models through a comprehensive robustness evaluation across seven perturbation dimensions. LIBERO-plus is a drop-in replacement for the original libero package: install it with pip install -e . from this repository, and your existing code runs without modification.
Key Findings
- Significant Fragility: VLA models exhibit extreme sensitivity to camera viewpoints and robot initial states, with performance dropping from 95% to below 30% under modest perturbations
- Language Ignorance: Models largely ignore language instructions, functioning more like Vision-Action models
- Negative Compositional Generalization: Combined perturbations reveal complex interaction effects beyond independent factors
LIBERO-plus Benchmark
7 Perturbation Dimensions
We introduce LIBERO-plus, a comprehensive benchmark with 10,030 tasks spanning:
- Objects Layout - Confounding objects and target object displacement
- Camera Viewpoints - Position, orientation, and field-of-view changes
- Robot Initial States - Manipulator initial pose variations
- Language Instructions - LLM-based instruction rewriting
- Light Conditions - Intensity, direction, color, and shadow variations
- Background Textures - Scene and surface appearance changes
- Sensor Noise - Photometric distortions and image degradation
Evaluated Models
- OpenVLA and variants (OFT, OFT_w, OFT_m)
- π₀ and π₀-FAST
- NORA, WorldVLA, UniVLA, RIPT-VLA
Installation
Please refer to our GitHub repository for more installation details. Our OpenVLA-OFT weights after mix-SFT can be downloaded from this Hugging Face repository. The assets and the training dataset are also available for download.
The extracted directory structure should look like:
LIBERO-plus/
└── libero/
    └── libero/
        └── assets/
            ├── articulated_objects/
            ├── new_objects/
            ├── scenes/
            ├── stable_hope_objects/
            ├── stable_scanned_objects/
            ├── textures/
            ├── turbosquid_objects/
            ├── serving_region.xml
            ├── wall_frames.stl
            └── wall.xml
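As a quick sanity check after extraction, the layout above can be verified with a short script. This is a minimal sketch, not part of the official tooling: the helper name `missing_assets` and the assumption that the checkout directory is named `LIBERO-plus` are illustrative, and only the entries listed in the tree are checked.

```python
from pathlib import Path

# Entries listed in the directory tree above, relative to libero/libero/assets/.
EXPECTED_ASSETS = [
    "articulated_objects",
    "new_objects",
    "scenes",
    "stable_hope_objects",
    "stable_scanned_objects",
    "textures",
    "turbosquid_objects",
    "serving_region.xml",
    "wall_frames.stl",
    "wall.xml",
]


def missing_assets(repo_root):
    """Return the expected asset entries that are absent under repo_root.

    Hypothetical helper for illustration; it only tests for existence and
    does not validate file contents.
    """
    assets_dir = Path(repo_root) / "libero" / "libero" / "assets"
    return [name for name in EXPECTED_ASSETS if not (assets_dir / name).exists()]


if __name__ == "__main__":
    # "LIBERO-plus" is an assumed checkout location; adjust to your own path.
    missing = missing_assets("LIBERO-plus")
    if missing:
        print(f"Missing asset entries: {missing}")
    else:
        print("All expected asset entries found.")
```

An empty result means every entry from the tree is present. Any names printed point to an incomplete download or to an archive extracted into the wrong directory.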
Evaluation
The evaluation method is almost identical to LIBERO. The only required modification is adjusting num_trials_per_task from 50 to 1 in your configuration.
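The change can be sketched as follows. `EvalConfig` here is a hypothetical stand-in for whatever configuration object your evaluation script uses; only the field name `num_trials_per_task` and the values 50 and 1 come from the text above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EvalConfig:
    """Hypothetical stand-in for an evaluation config; not the repository's class."""

    # Value used when evaluating on the original LIBERO benchmark.
    num_trials_per_task: int = 50


def to_libero_plus(cfg):
    """Return a copy of cfg that runs a single trial per task."""
    return replace(cfg, num_trials_per_task=1)


libero_cfg = EvalConfig()
plus_cfg = to_libero_plus(libero_cfg)
print(plus_cfg.num_trials_per_task)
```

If your script takes the value from a command-line flag or a YAML file instead, the same single change applies there. With 10,030 tasks in the benchmark, one trial per task corresponds to 10,030 rollouts for a full run.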
LIBERO-Plus Benchmark Leaderboard
| Model | Camera | Robot | Language | Light | Background | Noise | Layout | Total | 
|---|---|---|---|---|---|---|---|---|
| OpenVLA | 0.8 | 3.5 | 23.0 | 8.1 | 50.4 | 15.2 | 28.5 | 17.3 | 
| OpenVLA-OFT | 56.4 | 31.9 | 79.5 | 88.7 | 97.3 | 75.8 | 74.2 | 70.0 | 
| OpenVLA-OFT_w | 10.4 | 38.7 | 70.5 | 76.8 | 99.2 | 49.9 | 69.9 | 56.4 | 
| NORA | 2.2 | 37.0 | 65.1 | 45.7 | 65.5 | 12.8 | 62.1 | 39.8 | 
| WorldVLA | 0.1 | 27.9 | 41.6 | 43.7 | 19.8 | 10.9 | 38.0 | 25.3 | 
| UniVLA | 1.8 | 46.2 | 69.6 | 69.0 | 90.7 | 21.2 | 31.9 | 43.9 | 
| π₀ | 13.8 | 6.0 | 58.8 | 85.0 | 90.7 | 79.0 | 68.9 | 54.6 | 
| π₀-FAST | 65.1 | 21.6 | 61.0 | 73.2 | 97.7 | 74.4 | 68.8 | 64.2 | 
| RIPT-VLA | 55.2 | 31.2 | 77.6 | 88.4 | 100.0 | 73.5 | 74.2 | 69.3 | 
| OpenVLA-OFT_m | 55.6 | 21.7 | 81.0 | 92.7 | 92.3 | 78.6 | 68.7 | 68.1 | 
| OpenVLA-OFT+ (Ours) | 92.8 | 30.3 | 85.8 | 94.9 | 93.9 | 89.3 | 77.6 | 79.6 | 
- OpenVLA-OFT+ is OpenVLA-OFT after mix-SFT on the LIBERO-plus dataset.
- OpenVLA-OFT_w is OpenVLA-OFT without wrist observation input.
- OpenVLA-OFT_m is OpenVLA-OFT with mix-SFT.
Original LIBERO Benchmark Leaderboard
For convenience, we have compiled the evaluation results of current VLA models on the original LIBERO benchmark in this table.
Citation
If you find this work useful for your research, please cite our paper:
@article{fei25libero-plus,
    title={LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models},
    author={Senyu Fei and Siyin Wang and Junhao Shi and Zihao Dai and Jikun Cai and Pengfang Qian and Li Ji and Xinzhe He and Shiduo Zhang and Zhaoye Fei and Jinlan Fu and Jingjing Gong and Xipeng Qiu},
    journal = {arXiv preprint arXiv:2510.13626},
    year={2025},
}