|
--- |
|
license: apache-2.0 |
|
library_name: transformers |
|
pipeline_tag: robotics |
|
--- |
|
|
|
# UniVLA
|
> This is the official checkpoint of our RSS 2025 work: **Learning to Act Anywhere with Task-centric Latent Actions** |
|
|
|
#### Paper: https://arxiv.org/pdf/2505.06111 |
|
#### Code: https://github.com/OpenDriveLab/UniVLA |
|
|
|
## Highlights
|
- A recipe towards a generalist policy that plans in a unified, embodiment-agnostic action space.
|
- A novel approach for extracting task-centric latent actions from cross-embodiment videos. |
|
- A VLA that achieves state-of-the-art results on multiple benchmarks with compute-efficient training. |
|
|
|
## How to use |
|
|
|
This checkpoint is UniVLA pre-trained on our full data collection (OpenX + Ego4D). For fine-tuning on simulation benchmarks or on your own dataset, please visit our [official repo](https://github.com/OpenDriveLab/UniVLA).
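As a rough illustration only, the snippet below sketches how a checkpoint like this could be loaded with the Transformers `Auto*` classes. The repo id is a hypothetical placeholder, and the actual model class, processor, and inference API for UniVLA are defined in the official repo linked above, which should be treated as the reference.

```python
# Illustrative loading sketch only -- the repo id below is a placeholder and the
# actual UniVLA loading/inference API lives in https://github.com/OpenDriveLab/UniVLA.
import torch
from transformers import AutoModel, AutoProcessor

REPO_ID = "OpenDriveLab/UniVLA"  # hypothetical placeholder; replace with this checkpoint's Hub repo id

# Load the (assumed) processor and model weights; trust_remote_code allows
# custom model code shipped with the checkpoint to be executed.
processor = AutoProcessor.from_pretrained(REPO_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(
    REPO_ID,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda")
model.eval()
```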
|
|
|
## Citation
|
If you find our code or models useful in your work, please cite [our paper](https://arxiv.org/pdf/2505.06111): |
|
|
|
```bibtex
@article{bu2025univla,
  title={UniVLA: Learning to Act Anywhere with Task-centric Latent Actions},
  author={Bu, Qingwen and Yang, Yanting and Cai, Jisong and Gao, Shenyuan and Ren, Guanghui and Yao, Maoqing and Luo, Ping and Li, Hongyang},
  journal={arXiv preprint arXiv:2505.06111},
  year={2025}
}
```