WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation Paper • 2508.16763 • Published Aug 22
BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning Paper • 2508.09804 • Published Aug 13
Scope: Selective Cross-modal Orchestration of Visual Perception Experts Paper • 2510.12974 • Published Oct 14
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks Paper • 2210.12786 • Published Oct 23, 2022
Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory Paper • 2307.10768 • Published Jul 20, 2023
MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting Paper • 2210.07179 • Published Oct 13, 2022
Learning to Learn: How to Continuously Teach Humans and Machines Paper • 2211.15470 • Published Nov 28, 2022