A Series of Open-Source RL-Finetuned Hierarchical Agentic Models
GROOT is a research series investigating how self-supervised and weakly supervised learning can be used to train agents that follow instructions.
Vision-Language-Action Models in Minecraft
- CraftJarvis/JarvisVLA-Qwen2-VL-7B (Image-Text-to-Text • 8B)
- JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse (Paper • 2503.16365)
- Minecraft VLM Leaderboard (Space: search and view the Minecraft VLM leaderboard)
- CraftJarvis/minecraft-vla-sft (Dataset • 3.78M rows; see the loading sketch below)
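The minecraft-vla-sft dataset above can be pulled straight from the Hub with the `datasets` library. The snippet below is a minimal sketch, not official usage: the `train` split name and the streaming access pattern are assumptions, so check the dataset viewer for the actual splits and column schema.

```python
from datasets import load_dataset

# Minimal sketch: stream the SFT data rather than downloading all ~3.78M rows at once.
# The "train" split name is an assumption; verify it in the dataset viewer.
ds = load_dataset("CraftJarvis/minecraft-vla-sft", split="train", streaming=True)

# Peek at a few examples to inspect the column schema.
for example in ds.take(3):
    print(example.keys())
```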
A Series of Open-Source Hierarchical Agentic Models & Datasets in Minecraft
- OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft (Paper • 2509.13347)
- CraftJarvis/minecraft-openha-qwen2vl-7b-2509 (Image-Text-to-Text • 8B; see the inference sketch below)
- CraftJarvis/minecraft-textvla-qwen2vl-7b-2509 (Image-Text-to-Text • 8B)
- CraftJarvis/minecraft-pointha-qwen2vl-7b-2509 (Image-Text-to-Text • 8B)
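The OpenHA and JarvisVLA checkpoints are built on Qwen2-VL-7B, so a reasonable starting point is the standard Qwen2-VL chat interface in `transformers`. The sketch below is an assumption-heavy illustration rather than the official inference pipeline: the prompt format, the example instruction, the frame path, and how generated text maps to keyboard/mouse actions are all placeholders; consult the model cards and the OpenHA paper for the exact protocol.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Any of the Qwen2-VL-based checkpoints listed above; swap in
# CraftJarvis/JarvisVLA-Qwen2-VL-7B if preferred.
model_id = "CraftJarvis/minecraft-openha-qwen2vl-7b-2509"

processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# One game frame plus a natural-language instruction (both placeholders).
frame = Image.open("frame.png")
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "chop down the nearest tree"},
    ],
}]

prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[frame], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)

# Raw text output; decoding it into in-game actions is model-specific.
print(processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0])
```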
ROCKET is a research series that explores vision-based goal specification methods.
- Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents (Paper • 2507.23698)
- ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment (Paper • 2503.02505)
- ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting (Paper • 2410.17856)
- phython96/ROCKET-1 (Robotics)