---
license: apache-2.0
library_name: diffusers
---

<h1 align="center">
  RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer
</h1>


<div align="center" class="authors">
  Liu Liu,
  Xiaofeng Wang,
  Guosheng Zhao,
  Keyu Li,
  Wenkang Qin,
  Jiaxiong Qiu,
  Zheng Zhu,
  Guan Huang,
  Zhizhong Su
</div>

<div align="center" style="line-height: 3;">
  <a href="https://github.com/HorizonRobotics/RoboTransfer" target="_blank" style="margin: 2px;">
    <img alt="Code" src="https://img.shields.io/badge/Code-Github-blue" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://horizonrobotics.github.io/robot_lab/robotransfer" target="_blank" style="margin: 2px;">
    <img alt="Project Page" src="https://img.shields.io/badge/🌐-Project_Page-blue" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://arxiv.org/abs/2505.23171" target="_blank" style="margin: 2px;">
    <img alt="arXiv" src="https://img.shields.io/badge/📄-arXiv-b31b1b" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://youtu.be/dGXKtqDnm5Q" target="_blank" style="margin: 2px;">
    <img alt="Video" src="https://img.shields.io/badge/🎥-Video-red" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://mp.weixin.qq.com/s/c9-1HPBMHIy4oEwyKnsT7Q" target="_blank" style="margin: 2px;">
    <img alt="Chinese introduction (WeChat)" src="https://img.shields.io/badge/中文介绍-07C160?logo=wechat&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>

<div align="center">
  <img src="assets/pin.jpg" width="40%" alt="RoboTransfer"/></div>

---

## 🔍 Abstract

![RoboTransfer Pipeline](assets/robotransfer.jpg)

**RoboTransfer** is a novel diffusion-based video generation framework tailored for robotic visual policy transfer. Unlike conventional approaches, RoboTransfer introduces **geometry-aware synthesis** by injecting **depth and normal priors**, ensuring multi-view consistency across dynamic robotic scenes. The method further supports **explicit control over scene components**, such as **background editing**, **object identity swapping**, and **motion specification**, offering a fine-grained video generation pipeline that benefits embodied learning.

---

## 🧠 Key Features

- 📐 **Geometry-Consistent Diffusion**: Injects global 3D cues (depth, normal) and cross-view interactions for multi-view realism.
- 🧩 **Scene Component Control**: Enables manipulation of object attributes (pose, identity) and background features.
- 🔁 **Cross-View Conditioning**: Learns representations from multiple camera views with spatial correspondence.
- 🤖 **Robotic Policy Transfer**: Facilitates domain adaptation by generating synthetic training data in target domains.
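
To make the depth and normal cues mentioned above more concrete, here is a minimal NumPy sketch of how a surface-normal map can be estimated from a depth map and stacked with it into a per-pixel geometry condition. This is an illustration only, not the RoboTransfer implementation: the function names `normals_from_depth` and `stack_geometry_condition` are hypothetical, and the orthographic finite-difference approximation is an assumption chosen for brevity.

```python
import numpy as np


def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """Estimate unit surface normals from an (H, W) depth map.

    Hypothetical helper: uses finite differences under an orthographic
    approximation (camera intrinsics are ignored).
    """
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    # For a surface z = f(x, y), an (unnormalized) normal is (-dz/dx, -dz/dy, 1).
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals


def stack_geometry_condition(depth: np.ndarray, normals: np.ndarray) -> np.ndarray:
    """Concatenate depth (1 channel) and normals (3 channels) into (H, W, 4)."""
    return np.concatenate([depth[..., None], normals], axis=-1)


if __name__ == "__main__":
    # Synthetic depth ramp that increases by one unit per pixel along x.
    depth = np.tile(np.arange(5, dtype=float), (4, 1))
    normals = normals_from_depth(depth)
    condition = stack_geometry_condition(depth, normals)
    print(condition.shape)
```

In practice, a depth map from a sensor or a monocular estimator would replace the synthetic ramp, and how such a condition is fed to the diffusion model is specific to the released code.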

---

## 📖 BibTeX

```bibtex
@article{liu2025robotransfer,
  title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
  author={Liu, Liu and Wang, Xiaofeng and Zhao, Guosheng and Li, Keyu and Qin, Wenkang and Qiu, Jiaxiong and Zhu, Zheng and Huang, Guan and Su, Zhizhong},
  journal={arXiv preprint arXiv:2505.23171},
  year={2025}
}