magicwpf

https://magicwpf.github.io/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

Kling-MotionControl Technical Report

Paper • 2603.03160 • Published 2 days ago • 24

upvoted a paper 21 days ago

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

Paper • 2602.08711 • Published 24 days ago • 28

upvoted 2 papers 28 days ago

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

Paper • 2602.03510 • Published about 1 month ago • 27

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Paper • 2602.04804 • Published 29 days ago • 46

upvoted 2 papers 30 days ago

3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation

Paper • 2602.03796 • Published about 1 month ago • 62

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Paper • 2602.01630 • Published Feb 2 • 46

upvoted a paper about 1 month ago

SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer

Paper • 2601.16515 • Published Jan 23 • 15

upvoted 2 papers about 2 months ago

CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

Paper • 2601.10061 • Published Jan 15 • 31

GARDO: Reinforcing Diffusion Models without Reward Hacking

Paper • 2512.24138 • Published Dec 30, 2025 • 29

upvoted 3 papers 2 months ago

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models

Paper • 2512.15560 • Published Dec 17, 2025 • 25

T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation

Paper • 2512.21094 • Published Dec 24, 2025 • 25

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published Dec 23, 2025 • 93

upvoted 8 papers 3 months ago

Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

Paper • 2512.16905 • Published Dec 18, 2025 • 32

StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors

Paper • 2512.16915 • Published Dec 18, 2025 • 38

Kling-Omni Technical Report

Paper • 2512.16776 • Published Dec 18, 2025 • 171

MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives

Paper • 2512.14699 • Published Dec 16, 2025 • 28

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Paper • 2512.12675 • Published Dec 14, 2025 • 41

KlingAvatar 2.0 Technical Report

Paper • 2512.13313 • Published Dec 15, 2025 • 43

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Paper • 2512.11749 • Published Dec 12, 2025 • 39

UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Paper • 2512.07831 • Published Dec 8, 2025 • 17