pg-team

community

Recent Activity

Cyril666 authored a paper 2 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Cyril666 authored a paper 2 days ago

What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

Cyril666 authored a paper 2 days ago

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Cyril666

authored 7 papers 2 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22, 2025 • 90

What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

Paper • 2502.14914 • Published Feb 19, 2025

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Paper • 2509.21268 • Published Sep 25, 2025 • 104

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published Dec 18, 2025 • 20

Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents

Paper • 2410.13185 • Published Oct 17, 2024 • 5

Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition

Paper • 2407.05562 • Published Jul 8, 2024

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 46

TainU

authored a paper 3 months ago

RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

Paper • 2512.16864 • Published Dec 18, 2025 • 11

lkeab

submitted a paper to Daily Papers 3 months ago

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published Dec 18, 2025 • 20

TainU

submitted a paper to Daily Papers 3 months ago

RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

Paper • 2512.16864 • Published Dec 18, 2025 • 11

lkeab

authored 10 papers 3 months ago

Mask Transfiner for High-Quality Instance Segmentation

Paper • 2111.13673 • Published Nov 26, 2021

Cascade-DETR: Delving into High-Quality Universal Object Detection

Paper • 2307.11035 • Published Jul 20, 2023

Gaussian Grouping: Segment and Edit Anything in 3D Scenes

Paper • 2312.00732 • Published Dec 1, 2023 • 3

Mask-Free Video Instance Segmentation

Paper • 2303.15904 • Published Mar 28, 2023

Matching Anything by Segmenting Anything

Paper • 2406.04221 • Published Jun 6, 2024

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

Paper • 2405.02280 • Published May 3, 2024

SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking

Paper • 2409.11235 • Published Sep 17, 2024

Multi-View 3D Point Tracking

Paper • 2508.21060 • Published Aug 28, 2025 • 23

RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation

Paper • 2510.23571 • Published Oct 27, 2025 • 9

MotionEdit: Benchmarking and Learning Motion-Centric Image Editing

Paper • 2512.10284 • Published Dec 11, 2025 • 26