Update README data
README.md
CHANGED
@@ -66,40 +66,45 @@ This model is trained on ``gen-robot/openvla-7b-rlvla-warmup`` by Group Relative
 
 ## Full OOD Evaluation and Results
 ### Overall OOD Eval Results
-Note: rl4vla refers to the paper VLA-RL-Study: What Can RL Bring to VLA Generalization? An Empirical Study.
-
-
-
 ### OOD Eval on Vision
 
-| Description
-
-| vision avg
-| unseen table
-| dynamic texture (weak) |
-| dynamic texture (strong)
-| dynamic noise (weak)
-| dynamic noise (strong)
 
 ### OOD Eval on Semantic
-
-
-
-
-
-| unseen
-| unseen
-
-| multi-object (both
-
-
 
 ### OOD Eval on Position
-
-
-
-
-
 
 ## Full OOD Evaluation and Results
 ### Overall OOD Eval Results
+Note: rl4vla refers to the paper [VLA-RL-Study: What Can RL Bring to VLA Generalization? An Empirical Study](https://arxiv.org/abs/2505.19789).
+
+| Description | rl4vla | GRPO-openvlaoft | PPO-openvlaoft | PPO-openvla | __GRPO-openvla__ |
+|-------------|--------|-----------------|----------------|-------------|------------------|
+| Avg results | 76.08 | 61.48 | 64.53 | **82.21** | 75.47 |
+
 ### OOD Eval on Vision
 
+| Description | rl4vla | GRPO-openvlaoft | PPO-openvlaoft | PPO-openvla | __GRPO-openvla__ |
+|-------------|--------|-----------------|----------------|-------------|------------------|
+| vision avg | 76.56 | **84.69** | 80.55 | 82.03 | 74.69 |
+| unseen table | 84.40 | 91.41 | 94.53 | **95.70** | 89.84 |
+| dynamic texture (weak) | 83.30 | **91.02** | 82.42 | 85.55 | 78.91 |
+| dynamic texture (strong) | 63.00 | **77.34** | 62.50 | 72.27 | 65.62 |
+| dynamic noise (weak) | 85.40 | 89.45 | **89.84** | 87.11 | 79.69 |
+| dynamic noise (strong) | 66.70 | **74.22** | 73.44 | 69.53 | 59.38 |
 
 ### OOD Eval on Semantic
+
+| Description | rl4vla | GRPO-openvlaoft | PPO-openvlaoft | PPO-openvla | __GRPO-openvla__ |
+|-------------|--------|-----------------|----------------|-------------|------------------|
+| object avg | 75.40 | 51.61 | 56.64 | **80.57** | 74.41 |
+| train setting | 93.80 | 94.14 | 91.80 | **96.09** | 84.38 |
+| unseen objects | 71.40 | 80.47 | 77.73 | **81.64** | 76.56 |
+| unseen receptacles | 75.00 | 74.22 | 78.12 | **81.25** | 73.44 |
+| unseen instructions | 89.10 | 67.97 | 68.36 | **94.53** | 89.06 |
+| multi-object (both seen) | 75.00 | 35.16 | 42.97 | **84.38** | 75.78 |
+| multi-object (both unseen) | 57.80 | 30.47 | 38.67 | **62.89** | 57.81 |
+| distractive receptacle | 81.20 | 18.75 | 31.64 | **82.81** | 78.12 |
+| multi-receptacle (both unseen) | 59.90 | 11.72 | 23.83 | **60.94** | 60.16 |
 
 ### OOD Eval on Position
+
+| Description | rl4vla | GRPO-openvlaoft | PPO-openvlaoft | PPO-openvla | __GRPO-openvla__ |
+|-------------|--------|-----------------|----------------|-------------|------------------|
+| position avg | 77.60 | 42.97 | 56.05 | **89.26** | 81.64 |
+| unseen position (object & receptacle) | 80.70 | 40.23 | 50.39 | **86.33** | 75.00 |
+| mid-episode object reposition | 74.50 | 45.70 | 61.72 | **92.19** | 88.28 |
+
 
 ## How to Use
 Please integrate the provided model with the [RLinf](https://github.com/RLinf/RLinf) codebase. To do so, modify the following parameters in the configuration file ``examples/embodiment/config/maniskill_grpo_openvla.yaml``: