Ian Skibidi PRO
FreestylerAI
AI & ML interests
None yet
Recent Activity
reacted to prithivMLmods's post with 🔥 about 4 hours ago
Qwen Image: The Latest Image Generation Model 🔥
Below are some samples generated using the Qwen Image Diffusion Model. Qwen-Image, a 20B MMDiT model for next-generation text-to-image generation, preserves typographic details, layout coherence, and contextual harmony with impressive accuracy. It is especially strong at creating striking graphic posters with native text. The model is now open-source; a minimal usage sketch follows the links below. [ Qwen-Image : https://huggingface.co/Qwen/Qwen-Image ]
⤷ Try the Qwen Image demo here: https://huggingface.co/spaces/prithivMLmods/Qwen-Image-Diffusion, https://huggingface.co/spaces/Qwen/Qwen-Image & more ...
⤷ Qwen-Image Technical Report: https://huggingface.co/papers/2508.02324
⤷ Qwen Image [GitHub]: https://github.com/QwenLM/Qwen-Image
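For anyone who wants to try it locally, here is a minimal text-to-image sketch using the diffusers library. The automatic pipeline resolution, dtype choice, and generation arguments are assumptions based on the standard diffusers API, not the official recipe; check the Qwen/Qwen-Image model card for the recommended settings.

```python
# Minimal text-to-image sketch with diffusers (assumed API; see the
# Qwen/Qwen-Image model card for the officially recommended arguments).
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",
    torch_dtype=torch.bfloat16,  # assumes a GPU with bf16 support
)
pipe.to("cuda")

prompt = "A vintage travel poster for Kyoto with the caption 'Visit Kyoto'"
# num_inference_steps is a generic diffusers argument; tune per the model card
image = pipe(prompt=prompt, num_inference_steps=50).images[0]
image.save("qwen_image_sample.png")
```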
Even more impressively, it demonstrates a strong ability to understand images. The model supports a wide range of vision-related tasks such as object detection, semantic segmentation, depth and edge (Canny) estimation, novel view synthesis, and image super-resolution. While each task is technically distinct, they can all be viewed as advanced forms of intelligent image editing driven by deep visual understanding. Collectively, these capabilities position Qwen-Image as more than just a tool for generating appealing visuals: it serves as a versatile foundation model for intelligent visual creation and transformation, seamlessly blending language, layout, and imagery.
Qwen-Image uses a dual-stream MMDiT architecture with a frozen Qwen2.5-VL, VAE encoder, RMSNorm for QK-Norm, LayerNorm elsewhere, and a custom MSRoPE scheme for joint image-text positional encoding.
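To make the "RMSNorm for QK-Norm" detail concrete, here is a small PyTorch sketch of an attention block that applies RMSNorm to the per-head queries and keys before the attention product. The class name, shapes, and dimensions are illustrative and are not taken from the Qwen-Image codebase (nn.RMSNorm requires PyTorch 2.4+).

```python
# Illustrative QK-Norm attention: RMSNorm applied to queries and keys
# before attention. Names and shapes are hypothetical, not Qwen-Image's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QKNormAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.out = nn.Linear(dim, dim)
        # QK-Norm: normalize per-head queries and keys with RMSNorm
        self.q_norm = nn.RMSNorm(self.head_dim)
        self.k_norm = nn.RMSNorm(self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (batch, heads, tokens, head_dim)
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        q, k = self.q_norm(q), self.k_norm(k)  # stabilizes attention logits
        attn = F.scaled_dot_product_attention(q, k, v)
        return self.out(attn.transpose(1, 2).reshape(b, n, d))
```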
To learn more, visit the respective model card.
reacted to prithivMLmods's post with 🤗 about 4 hours ago
liked a model about 11 hours ago
Qwen/Qwen-Image