CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence
Abstract
CARLA-Air integrates high-fidelity driving and multirotor flight simulation within a unified Unreal Engine framework, supporting joint air-ground agent modeling with photorealistic environments and multi-modal sensing capabilities.
The convergence of low-altitude economies, embodied intelligence, and air-ground cooperative systems creates growing demand for simulation infrastructure capable of jointly modeling aerial and ground agents within a single physically coherent environment. Existing open-source platforms remain domain-segregated: driving simulators lack aerial dynamics, while multirotor simulators lack realistic ground scenes. Bridge-based co-simulation introduces synchronization overhead and cannot guarantee strict spatial-temporal consistency. We present CARLA-Air, an open-source infrastructure that unifies high-fidelity urban driving and physics-accurate multirotor flight within a single Unreal Engine process. The platform preserves both CARLA and AirSim native Python APIs and ROS 2 interfaces, enabling zero-modification code reuse. Within a shared physics tick and rendering pipeline, CARLA-Air delivers photorealistic environments with rule-compliant traffic, socially-aware pedestrians, and aerodynamically consistent UAV dynamics, synchronously capturing up to 18 sensor modalities across all platforms at each tick. The platform supports representative air-ground embodied intelligence workloads spanning cooperation, embodied navigation and vision-language action, multi-modal perception and dataset construction, and reinforcement-learning-based policy training. An extensible asset pipeline allows integration of custom robot platforms into the shared world. By inheriting AirSim's aerial capabilities -- whose upstream development has been archived -- CARLA-Air ensures this widely adopted flight stack continues to evolve within a modern infrastructure. Released with prebuilt binaries and full source: https://github.com/louiszengCN/CarlaAir
Community
CARLA-Air is an open-source infrastructure that unifies high-fidelity urban driving and physics-accurate multirotor flight within a single Unreal Engine process, providing a practical simulation foundation for air-ground embodied intelligence research.
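The "single process, shared physics tick" claim is the core of the design: because ground and aerial agents advance inside one world step, they always observe the same simulation clock, which bridge-based co-simulation cannot strictly guarantee. A minimal, self-contained Python sketch (illustrative only — not the actual CARLA-Air API; agent and function names are invented for this example) of lockstep stepping:

```python
from dataclasses import dataclass

DT = 0.05  # fixed physics timestep in seconds (synchronous-mode style)

@dataclass
class Agent:
    """Toy stand-in for a simulated vehicle or multirotor."""
    name: str
    velocity: float
    position: float = 0.0
    sim_time: float = 0.0

    def step(self, dt: float) -> None:
        # Advance this agent's state by exactly one physics tick.
        self.position += self.velocity * dt
        self.sim_time += dt

def unified_tick(agents, n_ticks: int, dt: float = DT) -> None:
    """Step every agent inside one shared world tick (single-process model)."""
    for _ in range(n_ticks):
        for agent in agents:
            agent.step(dt)

car = Agent("car", velocity=10.0)  # ground agent
uav = Agent("uav", velocity=5.0)   # aerial agent
unified_tick([car, uav], n_ticks=100)

# Both agents share exactly the same simulation clock: zero temporal drift.
assert car.sim_time == uav.sim_time
```

In a bridged setup, each simulator would accumulate its own clock independently, so any per-step latency difference compounds into temporal drift; the shared-tick loop above removes that failure mode by construction.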
CARLA-Air's idea of a single Unreal Engine world that unifies CARLA driving and AirSim flight in one process is a clean way to unblock air-ground embodied research. My main worry is how robust the shared physics tick stays under heavier workloads: when you scale agents and sensors, could RPC latency or frame-queue pressure introduce subtle spatial-temporal drift even with the unified world? An ablation showing the impact of removing either the ground or aerial module on downstream perception and control would help confirm where the joint gains come from. BTW, the ArxivLens breakdown (https://arxivlens.com/PaperView/Details/carla-air-fly-drones-inside-a-carla-world-a-unified-infrastructure-for-air-ground-embodied-intelligence-8595-517d7330) helped me parse the method details.
Thanks for the thoughtful feedback! The drift concern under heavier workloads is something we're actively thinking about—the unified tick does help a lot, but you're right that frame-queue pressure at scale is still a potential weak point. We're planning to run more systematic stress tests on that front.
The ablation is a great suggestion and honestly something we want to add in a follow-up.
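One lightweight way to quantify the drift concern raised above is to time each tick under a synthetic sensor load and compare mean versus worst-case duration. The sketch below is a hypothetical measurement harness (the workload function and all names are invented for illustration; it does not use the CARLA-Air API):

```python
import time
import statistics

def synthetic_sensor_load(n_pixels: int) -> int:
    # Stand-in for per-tick sensor capture cost (hypothetical workload).
    return sum(i & 0xFF for i in range(n_pixels))

def measure_tick_jitter(n_ticks: int = 50, n_pixels: int = 20_000):
    """Time each simulated tick; return (mean, worst-case) wall-clock duration."""
    durations = []
    for _ in range(n_ticks):
        start = time.perf_counter()
        synthetic_sensor_load(n_pixels)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), max(durations)

mean_dt, worst_dt = measure_tick_jitter()
assert worst_dt >= mean_dt  # worst-case tick never beats the average
```

Sweeping `n_pixels` (or the agent count) and watching how the worst-case tick diverges from the mean would give a concrete stress-test curve for the frame-queue-pressure question.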
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- AirSimAG: A High-Fidelity Simulation Platform for Air-Ground Collaborative Robotics (2026)
- aerial-autonomy-stack - a Faster-than-real-time, Autopilot-agnostic, ROS2 Framework to Simulate and Deploy Perception-based Drones (2026)
- HUGE-Bench: A Benchmark for High-Level UAV Vision-Language-Action Tasks (2026)
- AeroGen: Agentic Drone Autonomy through Single-Shot Structured Prompting&Drone SDK (2026)
- PiLoT: Neural Pixel-to-3D Registration for UAV-based Ego and Target Geo-localization (2026)
- Fly, Track, Land: Infrastructure-less Magnetic Localization for Heterogeneous UAV-UGV Teaming (2026)
- Seeing Where to Deploy: Metric RGB-Based Traversability Analysis for Aerial-to-Ground Hidden Space Inspection (2026)
Get this paper in your agent:
hf papers read 2603.28032
Don't have the latest CLI? Install it with:
curl -LsSf https://hf.co/cli/install.sh | bash
Models citing this paper: 1
Datasets citing this paper: 0
Spaces citing this paper: 0