PointArena: Probing Multimodal Grounding Through Language-Guided Pointing • Paper • 2505.09990 • Published May 15, 2025 • 12
Style Customization of Text-to-Vector Generation with Image Diffusion Priors • Paper • 2505.10558 • Published May 15, 2025 • 16
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis • Paper • 2505.10046 • Published May 15, 2025 • 9
X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real • Paper • 2505.07096 • Published May 11, 2025 • 5