Ember

Team

company

AI & ML interests

None defined yet.

Recent Activity

Team members: 9 (private)

Zhiding

authored a paper 2 days ago

PhyCritic: Multimodal Critic Models for Physical AI

Paper • 2602.11124 • Published 3 days ago • 49

valtsblukis

authored a paper 2 months ago

SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL

Paper • 2512.04069 • Published Dec 3, 2025 • 23

k-nick

authored a paper 2 months ago

FLARE: Robot Learning with Implicit World Modeling

Paper • 2505.15659 • Published May 21, 2025

k-nick

authored a paper 5 months ago

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Paper • 2509.21268 • Published Sep 25, 2025 • 104

k-nick

authored a paper 9 months ago

DreamGen: Unlocking Generalization in Robot Learning through Neural Trajectories

Paper • 2505.12705 • Published May 19, 2025

zwrq

authored a paper 10 months ago

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22, 2025 • 63

Zhiding

authored a paper 10 months ago

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Paper • 2504.15271 • Published Apr 21, 2025 • 67

RealZhiqiLi

authored a paper 10 months ago

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Paper • 2504.15271 • Published Apr 21, 2025 • 67

k-nick

authored 2 papers 11 months ago

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

Paper • 2503.14734 • Published Mar 18, 2025 • 6

Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework

Paper • 2503.10704 • Published Mar 12, 2025 • 5

Zhiding

authored a paper 11 months ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96

RealZhiqiLi

authored 8 papers about 1 year ago

FB-BEV: BEV Representation from Forward-Backward View Transformations

Paper • 2308.02236 • Published Aug 4, 2023

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

Paper • 2109.03814 • Published Sep 8, 2021

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

Paper • 2401.06197 • Published Jan 11, 2024 • 1

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

Paper • 2312.09245 • Published Dec 14, 2023

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Paper • 2403.09626 • Published Mar 14, 2024 • 15

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Paper • 2211.05778 • Published Nov 10, 2022

Driving with InternVL: Oustanding Champion in the Track on Driving with Language of the Autonomous Grand Challenge at CVPR 2024

Paper • 2412.07247 • Published Dec 10, 2024

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models

Paper • 2501.14818 • Published Jan 20, 2025 • 9

zwrq

authored a paper about 1 year ago

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14, 2025 • 33
