GENIUS: Generative Fluid Intelligence Evaluation Suite Paper • 2602.11144 • Published 18 days ago • 53
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 19 days ago • 185
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published 20 days ago • 39
How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing Paper • 2602.01851 • Published 28 days ago • 16
VIBE Model Results Collection This collection archives the raw output generations from various models evaluated on the VIBE benchmark. • 16 items • Updated 29 days ago • 2