oh sehun

sehun

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Paper • 2512.20605 • Published 5 days ago • 47

upvoted a paper 2 days ago

Latent Implicit Visual Reasoning

Paper • 2512.21218 • Published 4 days ago • 52

upvoted 3 papers 3 days ago

Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models

Paper • 2512.21337 • Published 4 days ago • 24

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published 5 days ago • 46

LongVideoAgent: Multi-Agent Reasoning with Long Videos

Paper • 2512.20618 • Published 5 days ago • 49

upvoted 4 papers 4 days ago

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

Paper • 2512.17206 • Published 10 days ago • 17

Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published 10 days ago • 26

CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Paper • 2512.19535 • Published 6 days ago • 10

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published 6 days ago • 59

upvoted 2 papers 5 days ago

QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

Paper • 2512.19134 • Published 7 days ago • 31

WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

Paper • 2512.19678 • Published 6 days ago • 27

upvoted 5 papers 6 days ago

Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers

Paper • 2512.17351 • Published 10 days ago • 22

Next-Embedding Prediction Makes Strong Vision Learners

Paper • 2512.16922 • Published 10 days ago • 80

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Paper • 2512.17012 • Published 10 days ago • 42

Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

Paper • 2512.17909 • Published 9 days ago • 36

Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

Paper • 2512.17008 • Published 10 days ago • 10

upvoted a paper 7 days ago

Differences That Matter: Auditing Models for Capability Gap Discovery and Rectification

Paper • 2512.16921 • Published 10 days ago • 7

upvoted an article 9 days ago

LLM based Audio models

Article • 11 days ago • 46

upvoted 2 papers 9 days ago

Kling-Omni Technical Report

Paper • 2512.16776 • Published 10 days ago • 160

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

Paper • 2512.14052 • Published 13 days ago • 39