537 137

Steven Gay PRO

StevenG640

AI & ML interests

None yet

Recent Activity

liked a dataset about 3 hours ago

AILab-CVC/SEED-Data-Edit-Part1-Unsplash

liked a dataset about 3 hours ago

AILab-CVC/SEED-Data-Edit-Part1-Openimages

liked a dataset about 3 hours ago

AILab-CVC/SEED-Data-Edit-Part2-3

upvoted a paper about 3 hours ago

SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

Paper • 2405.04007 • Published May 7, 2024 • 1

upvoted 3 papers 2 days ago

Chimera: Compositional Image Generation using Part-based Concepting

Paper • 2510.18083 • Published Oct 20 • 1

MaskAttn-SDXL: Controllable Region-Level Text-To-Image Generation

Paper • 2509.15357 • Published Sep 18 • 1

Structured Information for Improving Spatial Relationships in Text-to-Image Generation

Paper • 2509.15962 • Published Sep 19 • 1

upvoted a paper 4 days ago

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published 10 days ago • 63

upvoted 2 papers 6 days ago

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published 11 days ago • 96

Motif 2 12.7B technical report

Paper • 2511.07464 • Published 15 days ago • 38

upvoted a paper 7 days ago

Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions

Paper • 2511.06876 • Published 12 days ago • 22

upvoted a collection 7 days ago

Emu3.5

Native Multimodal Models are World Learners 🌍 • 4 items • Updated 10 days ago • 71

upvoted 11 papers 7 days ago

Simulating the Visual World with Artificial Intelligence: A Roadmap

Paper • 2511.08585 • Published 11 days ago • 28

LongScape: Advancing Long-Horizon Embodied World Models with Context-Aware MoE

Paper • 2509.21790 • Published Sep 26 • 1

A Comprehensive Survey on World Models for Embodied AI

Paper • 2510.16732 • Published Oct 19 • 1

Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models

Paper • 2510.18457 • Published Oct 21 • 3

RoboChallenge: Large-scale Real-robot Evaluation of Embodied Policies

Paper • 2510.17950 • Published Oct 20 • 7

FSFSplatter: Build Surface and Novel Views with Sparse-Views within 2min

Paper • 2510.02691 • Published Oct 3 • 1

YoNoSplat: You Only Need One Model for Feedforward 3D Gaussian Splatting

Paper • 2511.07321 • Published 12 days ago • 1

PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting

Paper • 2510.18714 • Published Oct 21 • 1

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

Paper • 2509.13414 • Published Sep 16 • 2

MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts

Paper • 2510.27234 • Published 23 days ago • 1

SPFSplatV2: Efficient Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views

Paper • 2509.17246 • Published Sep 21 • 2