M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition Paper • 2401.11649 • Published Jan 22, 2024 • 3
DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes Paper • 2409.04003 • Published Sep 6, 2024 • 1
O$^2$-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering Paper • 2505.16582 • Published May 22, 2025
KG-TRACES: Enhancing Large Language Models with Knowledge Graph-constrained Trajectory Reasoning and Attribution Supervision Paper • 2506.00783 • Published Jun 1, 2025 • 1
CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking Paper • 2507.11334 • Published Jul 15, 2025 • 1
IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video? Paper • 2509.24709 • Published 25 days ago • 5
RE-Searcher: Robust Agentic Search with Goal-oriented Planning and Self-reflection Paper • 2509.26048 • Published 24 days ago • 7
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks Paper • 2510.08002 • Published 15 days ago • 22