David PRO
davidrd123
AI & ML interests
None yet
Recent Activity
liked a Space 1 day ago
tencent/HunyuanVideo-Foley
updated a model 6 days ago
davidrd123/gr4f1tt0_v1_qwen
liked a Space 7 days ago
Wan-AI/Wan2.2-S2V
Organizations
Quantization
VLM
- SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
  Paper • 2502.14786 • Published • 146
- LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models
  Paper • 2502.14834 • Published • 24
- Qwen2.5-VL Technical Report
  Paper • 2502.13923 • Published • 203
- DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks
  Paper • 2502.17157 • Published • 53
Video
Memory
LLM
- Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information
  Paper • 2502.14258 • Published • 26
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 83
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 39
- Qwen Technical Report
  Paper • 2309.16609 • Published • 37
ImgEdit
ImgGen & Style
Diffusion Models + Means of Improving Quality + Style
- Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
  Paper • 2411.07126 • Published • 31
- Style-Friendly SNR Sampler for Style-Driven Generation
  Paper • 2411.14793 • Published • 40
- Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models
  Paper • 2411.09449 • Published
- OminiControl: Minimal and Universal Control for Diffusion Transformer
  Paper • 2411.15098 • Published • 62
PapersToRead