Misc - a samsam55 Collection

updated 2 days ago

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Paper • 2510.03663 • Published Oct 4, 2025 • 16

LLM-guided Hierarchical Retrieval

Paper • 2510.13217 • Published Oct 15, 2025 • 21

AnyUp: Universal Feature Upsampling

Paper • 2510.12764 • Published Oct 14, 2025 • 12

katanemo/Arch-Router-1.5B

Text Generation • 2B • Updated Apr 2 • 3.33k • 261

nvidia/Audio2Face-3D-v3.0

Updated Oct 21, 2025 • 668 • 74

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Paper • 2510.20150 • Published Oct 23, 2025 • 7

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 245

Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

Paper • 2604.19667 • Published 20 days ago • 22

The Last Harness You'll Ever Build

Paper • 2604.21003 • Published 19 days ago • 3

From Context to Skills: Can Language Models Learn from Context Skillfully?

Paper • 2604.27660 • Published 8 days ago • 148

From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills

Paper • 2604.24026 • Published 14 days ago • 21