---
license: apache-2.0
library_name: diffusers
---
<h1 align="center">
RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer
</h1>
<div align="center" class="authors">
Liu Liu,
Xiaofeng Wang,
Guosheng Zhao,
Keyu Li,
Wenkang Qin,
Jiaxiong Qiu,
Zheng Zhu,
Guan Huang,
Zhizhong Su
</div>
<div align="center" style="line-height: 3;">
<a href="https://github.com/HorizonRobotics/RoboTransfer" target="_blank" style="margin: 2px;">
<img alt="Code" src="https://img.shields.io/badge/Code-Github-blue" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://horizonrobotics.github.io/robot_lab/robotransfer" target="_blank" style="margin: 2px;">
<img alt="Project Page" src="https://img.shields.io/badge/Project_Page-blue" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://arxiv.org/abs/2505.23171" target="_blank" style="margin: 2px;">
<img alt="arXiv" src="https://img.shields.io/badge/arXiv-b31b1b" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://youtu.be/dGXKtqDnm5Q" target="_blank" style="margin: 2px;">
<img alt="Video" src="https://img.shields.io/badge/🎥-Video-red" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://mp.weixin.qq.com/s/c9-1HPBMHIy4oEwyKnsT7Q" target="_blank" style="margin: 2px;">
<img alt="Chinese Introduction" src="https://img.shields.io/badge/中文介绍-07C160?logo=wechat&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>
<div align="center">
<img src="assets/pin.jpg" width="40%" alt="RoboTransfer"/></div>

---

## Abstract

**RoboTransfer** is a diffusion-based video generation framework for robotic visual policy transfer. It performs **geometry-aware synthesis** by injecting **depth and normal priors**, which helps keep generated videos consistent across camera views in dynamic robotic scenes. It also supports **explicit control over scene components**, including **background editing**, **object identity swapping**, and **motion specification**, giving fine-grained control over the generated videos used for embodied learning.

---

## Key Features
- **Geometry-Consistent Diffusion**: Injects global 3D cues (depth, normal) and cross-view interactions for multi-view realism.
- **Scene Component Control**: Enables manipulation of object attributes (pose, identity) and background features.
- **Cross-View Conditioning**: Learns representations from multiple camera views with spatial correspondence.
- **Robotic Policy Transfer**: Facilitates domain adaptation by generating synthetic training data in target domains.
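
To make the "depth, normal" cues above more concrete, here is a minimal NumPy sketch of what such geometry signals can look like as arrays. It is not RoboTransfer's preprocessing or conditioning code: the function names `normals_from_depth` and `build_geometry_condition`, the finite-difference normal estimate (which ignores camera intrinsics), and the 4-channel layout are all assumptions made purely for illustration.

```python
import numpy as np


def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """Approximate unit surface normals from an (H, W) depth map.

    Assumption: finite differences in image space, no camera intrinsics,
    so this is only a rough geometric cue.
    """
    dz_dy, dz_dx = np.gradient(depth.astype(np.float32))
    # A surface z = f(x, y) has the unnormalized normal (-df/dx, -df/dy, 1).
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals


def build_geometry_condition(depth_video: np.ndarray) -> np.ndarray:
    """Stack depth and normals into a (T, H, W, 4) array (hypothetical layout)."""
    normal_video = np.stack([normals_from_depth(d) for d in depth_video])
    depth_channel = depth_video[..., None].astype(np.float32)
    return np.concatenate([depth_channel, normal_video], axis=-1)


# Toy input: 2 frames of a flat plane at depth 1.0.
depth_video = np.ones((2, 4, 4), dtype=np.float32)
condition = build_geometry_condition(depth_video)
print(condition.shape)  # (2, 4, 4, 4)
```

For the flat toy plane, every normal points straight along the depth axis, which is a quick sanity check that the estimate behaves as expected. How the actual model consumes such cues is described in the paper and the code repository linked above.
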

---

## BibTeX
```bibtex
@article{liu2025robotransfer,
  title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
  author={Liu, Liu and Wang, Xiaofeng and Zhao, Guosheng and Li, Keyu and Qin, Wenkang and Qiu, Jiaxiong and Zhu, Zheng and Huang, Guan and Su, Zhizhong},
  journal={arXiv preprint arXiv:2505.23171},
  year={2025}
}
```