Guian Fang PRO

Enderfga

https://enderfga.cn/

AI & ML interests

NLP, CV

Recent Activity

published a model about 2 months ago

Enderfga/wan

upvoted a paper 26 days ago

UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback

Paper • 2511.01678 • Published 27 days ago • 34

upvoted 3 papers about 2 months ago

Stencil: Subject-Driven Generation with Context Guidance

Paper • 2509.17120 • Published Sep 21 • 6

VChain: Chain-of-Visual-Thought for Reasoning in Video Generation

Paper • 2510.05094 • Published Oct 6 • 37

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published Oct 6 • 115

upvoted 3 papers 6 months ago

UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

Paper • 2505.23380 • Published May 29 • 22

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 109

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Paper • 2505.18445 • Published May 24 • 64

upvoted a paper 8 months ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25 • 73

upvoted 4 papers 9 months ago

Impossible Videos

Paper • 2503.14378 • Published Mar 18 • 61

Automated Movie Generation via Multi-Agent CoT Planning

Paper • 2503.07314 • Published Mar 10 • 44

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published Mar 3 • 44

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Paper • 2502.14397 • Published Feb 20 • 41

upvoted 3 papers 10 months ago

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Paper • 2502.08047 • Published Feb 12 • 28

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11 • 46

MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation

Paper • 2502.01572 • Published Feb 3 • 21

upvoted 5 papers about 1 year ago

ROICtrl: Boosting Instance Control for Visual Generation

Paper • 2411.17949 • Published Nov 27, 2024 • 87

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 90

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 71

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Paper • 2410.07133 • Published Oct 9, 2024 • 19

ChartThinker: A Contextual Chain-of-Thought Approach to Optimized Chart Summarization

Paper • 2403.11236 • Published Mar 17, 2024 • 1