Qwen3-VL-30B-A3B-Instruct-abliterated
Qwen3-VL-30B-A3B-Instruct-abliterated is an abliterated (v1.0) variant of **Qwen3-VL-30B-A3B-Instruct**, designed for Abliterated Reasoning and Captioning. The model builds on the Qwen3-VL-MoE (Mixture of Experts) architecture to deliver deeply descriptive, context-rich, and reasoning-oriented multimodal outputs. It handles complex, sensitive, and nuanced visual content while maintaining interpretive coherence and multilingual adaptability.
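For intuition about the expert routing mentioned above, here is a toy top-k gating sketch; it is illustrative only, and the hidden size, expert count, and top-k value are invented for the example rather than taken from the actual Qwen3-VL-MoE implementation.

```python
import torch
import torch.nn.functional as F

# Toy MoE layer: a router scores experts per token, and only the top-k
# experts run for that token. All sizes here are made up for illustration.
num_experts, top_k, hidden = 8, 2, 64
experts = [torch.nn.Linear(hidden, hidden) for _ in range(num_experts)]
router = torch.nn.Linear(hidden, num_experts)

def moe_forward(x):                       # x: (tokens, hidden)
    logits = router(x)                    # (tokens, num_experts)
    weights, idx = logits.topk(top_k, dim=-1)
    weights = F.softmax(weights, dim=-1)  # normalize over the selected experts
    out = torch.zeros_like(x)
    for slot in range(top_k):
        for e in range(num_experts):
            mask = idx[:, slot] == e      # tokens sending this slot to expert e
            if mask.any():
                out[mask] += weights[mask, slot].unsqueeze(-1) * experts[e](x[mask])
    return out

print(moe_forward(torch.randn(4, hidden)).shape)  # torch.Size([4, 64])
```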

Key Highlights
- **Abliterated / Uncensored Captioning and Reasoning:** Fine-tuned to bypass standard content filters while preserving factual accuracy, descriptive depth, and logical reasoning.
- **High-Fidelity Reasoning and Visual Understanding:** Generates detailed captions and structured reasoning for diverse visual categories, whether artistic, technical, abstract, or low-context.
- **Mixture of Experts (MoE) Efficiency:** Built on Qwen3-VL-MoE, dynamically routing computation through specialized experts for enhanced precision and scalability.
- **Aspect-Ratio Robustness:** Performs consistently across wide, tall, square, panoramic, and irregular visual formats.
- **Variational Detail Control:** Supports both concise summaries and highly detailed reasoning narratives, depending on prompt configuration (see the prompt sketch after this list).
- **Multilingual Output Capability:** Defaults to English but adaptable to multilingual use through prompt engineering.
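As a minimal, hypothetical sketch of how detail level and output language can be steered purely through the prompt, the phrasings below can be swapped into the "text" field of the chat message in the Quick Start that follows; the exact wording is a suggestion, not a documented API.

```python
# Illustrative prompt phrasings only (not a documented API) -- drop one of
# these into the "text" content entry of the Quick Start message below.
concise_prompt = "Caption this image in one sentence."
detailed_prompt = (
    "Provide an exhaustive caption: describe the objects, spatial layout, "
    "lighting, and style, then reason step by step about the scene."
)
# Output language follows the prompt language (here: French).
multilingual_prompt = "Décris cette image en détail, en français."
```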
Quick Start with Transformers
```python
from transformers import Qwen3VLMoeForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

# Load the model with automatic dtype selection and device placement.
model = Qwen3VLMoeForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3-VL-30B-A3B-Instruct-abliterated",
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Qwen3-VL-30B-A3B-Instruct-abliterated"
)

# Single-turn chat message with one image and a text instruction.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Provide a detailed caption and reasoning for this image."},
        ],
    }
]

# Render the chat template and extract the vision inputs.
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)  # follow the model's device rather than hard-coding "cuda"

# Generate, then strip the prompt tokens so only the new text is decoded.
generated_ids = model.generate(**inputs, max_new_tokens=128)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)
print(output_text)
```
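Since qwen_vl_utils also parses {"type": "video"} content entries, the same pipeline can be pointed at a video. This is a hedged sketch reusing the model and processor loaded above; the file path is a placeholder to replace with a real clip.

```python
# Hedged sketch: video captioning with the same model/processor as above.
# The video path below is a placeholder, not a real asset.
video_messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "file:///path/to/clip.mp4"},
            {"type": "text", "text": "Describe what happens in this video."},
        ],
    }
]

text = processor.apply_chat_template(
    video_messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(video_messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(
    generated_ids[:, inputs.input_ids.shape[1]:],  # drop the prompt tokens
    skip_special_tokens=True,
)[0])
```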
Intended Use
This model is suited for:
- Generating detailed, uncensored captions and reasoning for complex or creative visual datasets.
- Research in multimodal reasoning, safety evaluation, and content moderation studies.
- Enabling descriptive captioning and analytical reasoning for datasets excluded from mainstream models.
- Creative applications such as narrative generation, artistic interpretation, and visual storytelling.
- Advanced reasoning over diverse visual structures and aspect ratios.
Limitations
- May produce explicit, sensitive, or offensive content depending on input and prompt.
- Not recommended for deployment in production systems that require strict moderation or filtering.
- Style, tone, and reasoning detail can vary based on prompt phrasing.
- May show variable performance on synthetic, abstract, or highly stylized visual inputs.