CelesteChen's Collections

visual thinker

updated 2 days ago

  • Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

    Paper • 2511.04570 • Published 7 days ago • 187

  • V-Thinker: Interactive Thinking with Images

    Paper • 2511.04460 • Published 7 days ago • 93

  • TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning

    Paper • 2511.01833 • Published 10 days ago • 15

  • ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

    Paper • 2510.27492 • Published 14 days ago • 78

  • Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

    Paper • 2510.23473 • Published 17 days ago • 83

  • Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

    Paper • 2510.20579 • Published 21 days ago • 54