UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation Paper • 2511.08195 • Published Nov 2025 • 30
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations Paper • 2402.04236 • Published Feb 6, 2024 • 9
GLM-4.5 Collection GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated Aug 11, 2025 • 247
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1, 2025 • 238
An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes Paper • 2504.15270 • Published Apr 21, 2025 • 9
ExpLLM: Towards Chain of Thought for Facial Expression Recognition Paper • 2409.02828 • Published Sep 4, 2024
CogVLM2: Visual Language Models for Image and Video Understanding Paper • 2408.16500 • Published Aug 29, 2024 • 57
LVBench: An Extreme Long Video Understanding Benchmark Paper • 2406.08035 • Published Jun 12, 2024 • 1
LongAlign: A Recipe for Long Context Alignment of Large Language Models Paper • 2401.18058 • Published Jan 31, 2024 • 22
KoLA: Carefully Benchmarking World Knowledge of Large Language Models Paper • 2306.09296 • Published Jun 15, 2023 • 19
GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation Paper • 2303.14655 • Published Mar 26, 2023