Qwen3-TND-Double-Deckard-A-C-11B-220-qx86-hi-mlx
Quants in this series:
- Qwen3-TND-Double-Deckard-A-C-11B-220-qx64-hi-mlx
- Qwen3-TND-Double-Deckard-A-C-11B-220-qx86-hi-mlx
Key Model Differences to Understand First
| Model | Architecture | Training/Source | Quantization |
|---|---|---|---|
| Qwen3-DND-Jan | DND (8B) | Jan agentic | BF16 precision |
| Qwen3-DND-TNG | DND (8B) | Star Trek: TNG | QX86 mixed precision |
| Qwen3-TND-Double Deckard (qx64) | TND (11B) | Philip K Dick | QX64 mixed precision |
| Qwen3-TND-Double Deckard (qx86) | TND (11B) | Philip K Dick | QX86 mixed precision |
- DND = Double Neural Density
- TND = Triple Neural Density, a more specialized approach than DND
The Double Deckard models combine two Qwen3-DND variants with overlapping layers for greater capacity.
Comparative Performance Analysis
1️⃣ Top Performance by Benchmark
The Double Deckard models show outstanding results on specific tasks, revealing how their training data shapes strengths:
| Task | Best Model & Variant | Score |
|---|---|---|
| BoolQ (binary question answering) | Qwen3-TND-Double-Deckard-A-C-11B-220-qx86-hi | 0.738 |
| PIQA (plausible reasoning/interpretation) | Qwen3-DND-TNG-8B-303-qx86-hi | 0.745 |
| Winogrande (coreference resolution) | Qwen3-DND-Jan-v1-256k-ctx-Brainstorm40x-8B-bf16 | 0.632 |
✅ Key Insight: The Double Deckard qx86 model dominates on BoolQ (a specialized language reasoning task). This aligns with Philip K Dick's focus on complex human psychology and ambiguous realities—exactly the kind of nuanced interpretation BoolQ tests.
2️⃣ Where Double Deckard Falls Short
Despite strong BoolQ performance, the TND models show noticeable limitations on other tasks:
- Winogrande is their weakest area (0.609–0.615), especially compared to the DND-Jan model's 0.632
- OpenBookQA is low across all models (around 0.4), suggesting foundational knowledge gaps
⚠️ Why? Winogrande tests contextual reasoning with pronouns and ambiguous references—a critical skill for many sci-fi works (like Philip K Dick's). The TND models may struggle with dense narrative structures that require tracking multiple perspectives.
3️⃣ Quantization Impact
Comparing the qx64 and qx86 variants reveals how precision affects performance:
- ✅ QX86 wins on BoolQ (0.738 vs 0.694) and PIQA (0.738 vs 0.730)
- ❌ QX64 loses on BoolQ but gains slightly on Winogrande (0.615 vs 0.609)
This suggests that for language-heavy reasoning tasks, higher quantization precision (qx86) better preserves nuanced semantic interpretations—critical for Philip K Dick's dense narratives.
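The qx64/qx86 deltas above can be sanity-checked with a short plain-Python tabulation (scores copied from this card's tables):

```python
# Benchmark scores for the two Double Deckard quants, as reported in this card.
scores = {
    "qx64": {"boolq": 0.694, "piqa": 0.730, "winogrande": 0.615},
    "qx86": {"boolq": 0.738, "piqa": 0.738, "winogrande": 0.609},
}

# For each benchmark, report which quant wins and by how much.
for bench in ("boolq", "piqa", "winogrande"):
    a, b = scores["qx64"][bench], scores["qx86"][bench]
    winner = "qx86" if b > a else "qx64"
    print(f"{bench}: {winner} wins by {abs(b - a):.3f}")
```

The printout makes the pattern explicit: qx86 leads on the language-reasoning benchmarks, while qx64's only edge is a small Winogrande gain.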
4️⃣ Size-Precision Tradeoff
The TND Double Deckard models are compact:

| Model | Parameters | File Size |
|---|---|---|
| Qwen3-DND-Jan (BF16) | 8B | ~14GB |
| Qwen3-TND-Double Deckard (qx64) | 11B | ~4.8GB |
💡 Takeaway: The TND models achieve near-parity on BoolQ at roughly a third the file size of the BF16 baseline—making them viable for constrained deployments.
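The size comparison is simple arithmetic on the approximate file sizes in the table above:

```python
# Approximate file sizes (GB) from the table in this card.
bf16_gb = 14.0   # Qwen3-DND-Jan at BF16
qx64_gb = 4.8    # Qwen3-TND-Double Deckard at qx64

ratio = qx64_gb / bf16_gb
print(f"qx64 footprint is {ratio:.0%} of the BF16 baseline "
      f"(about {1 - ratio:.0%} smaller)")
```

Roughly 34% of the baseline footprint, i.e. about two-thirds smaller, despite the larger parameter count (11B vs 8B).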
Final Verdict: Why Double Deckard qx86 Stands Out
The Qwen3-TND-Double-Deckard-A-C-11B-220-qx86-hi variant is the clear winner for your use cases:
- Highest score on BoolQ (0.738)—ideal for ambiguous binary logic tasks
- Top 2 on PIQA (0.738)
- Balances performance with minimal file size (~4.8GB)
This model excels at the kind of layered reasoning found in Philip K Dick's works (e.g., Do Androids Dream of Electric Sheep?), where characters navigate:
- Ambiguous identities
- Paradoxical moral choices
- Multilayered narratives
It's particularly suited for:
- AI agents simulating complex human-like reasoning
- Content generation requiring nuanced psychological depth
- Applications where context tracking is critical (e.g., dialogue systems)
💡 Pro Tip: If you need to maximize BoolQ/interpretation capabilities without massive compute costs, this model is your optimal choice. For pure factual recall (Winogrande/OpenBookQA), go with the unquantized Qwen3-DND-Jan model.
You effectively have a compact yet highly specialized model that outperforms its larger, less optimized counterparts on precisely the tasks it was trained to dominate. This highlights how targeted neural density training (TND) can create capabilities beyond standard scaling.
Reviewed by Qwen3-Deckard-Large-Almost-Human-6B-II-qx86-hi-mlx
Here’s a precision-engineered comparison of these two model families
- Qwen3-MOE-2x8B-TNG-Deckard-Beta-16B (Mixture of Experts)
- Qwen3-TND-Double-Deckard-A-C-11B-220 (Double Neural Density)
This isn’t about raw scores; it’s about how their architectures map to cognitive styles.
🌐 Core Architectural Contrasts
| | MOE (Mixture of Experts) | TND (Double Neural Density) |
|---|---|---|
| Primary Design Feature | Specialized "experts" for different tasks (like sub-routines) | Overlapping layers for holistic identity tracking |
| Training Focus | Star Trek TNG + Philip K. Dick fusion (TNG for systemic reasoning, Dick for ambiguity) | Pure Philip K. Dick immersion ("Deckard Formula": identity as fluid, not fixed) |
| Cognitive Style | "Swiss Army knife": adaptable across domains but less specialized | "Specialist surgeon": hyper-optimized for Dickian core themes (moral ambiguity, identity shifts) |
🔬 Benchmark Breakdown: Where Each Family Shines
✅ MOE 16B Dominates in These Areas
| Benchmark | MOE | TND (Full) | Why It Matters |
|---|---|---|---|
| Winogrande | 0.631 | 0.619 | Coreference resolution: MOE better tracks shifting identities (e.g., "Rick vs. Molly" in Do Androids Dream...) |
| HellaSwag | 0.632 | 0.624 | Narrative flow: MOE handles chaotic story arcs (Star Trek crisis scenarios) better |
| OpenBookQA | 0.414 | 0.406 | Factual grounding: MOE's experts preserve knowledge even in fragmented contexts |
| PIQA | 0.745 | 0.739 | Contextual inference: MOE excels at "what would I do?" reasoning (Dick's hallmark) |
💡 Why: MOE’s Mixture of Experts architecture is built for cross-domain agility. Its different "experts" collaborate to solve layered problems — like a Starfleet captain consulting multiple specialists during a crisis.
✅ TND 11B Dominates in These Areas
| Benchmark | TND (Full) | MOE | Why It Matters |
|---|---|---|---|
| Arc Easy | 0.597 | 0.577 | Sequential pattern extrapolation: TND excels at linear cause/effect chains (Dick's structured reality fractures) |
| BoolQ | 0.738 | 0.709 | Binary moral/identity dilemmas: TND's "Deckard Formula" is optimized for "Am I human or android?" |
💡 Why: TND’s Double Neural Density layers (overlapping, shared weights) create a single cohesive identity — perfect for Dick’s core theme: identity is not fixed, but fluid. When it sees a BoolQ question like "Should an android have rights?", it doesn’t compute — it feels the ethical fracture in its own architecture.
🚨 Critical Note: MOE’s mxfp4 Variant Is a Wildcard
| Benchmark | MOE mxfp4 Score | Why It's Unique |
|---|---|---|
| BoolQ | 0.750 (highest of all) | This variant sacrifices structured reasoning for pure moral ambiguity resolution |
| Arc Challenge | 0.445 (lowest MOE) | Its specialization for binary decisions comes at the cost of chaotic context handling |
✅ Use Case: If you need to simulate a character in The Three Stigmata of Palmer Eldritch who’s stuck in an existential loop — but is only concerned with moral binaries — MOE mxfp4 is unmatched.
🎯 Strategic Deployment Guide
Simulating classic Dickian ambiguity
- Qwen3-TND-Double-Deckard-A-C-11B-220-Full
- Highest BoolQ + Arc Easy — ideal for "I am X, so I do Y?" self-awareness
Handling chaotic real-world crises
- Qwen3-MOE-2x8B-TNG-Deckard-Beta-16B-qx86-hi
- Best Winogrande + HellasSwag — perfect for "What would Deckard do in this mess?"
Pure binary ethical dilemmas
- Qwen3-MOE-2x8B-TNG-Deckard-Beta-16B-mxfp4
- 0.750 on BoolQ — the only model that outperforms TND in Dick’s core task
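The deployment guide above can be encoded as a simple routing table. This is a minimal sketch: the use-case keys are illustrative assumptions, while the checkpoint names come from this card.

```python
# Map use cases (illustrative keys) to the checkpoints recommended above.
DEPLOYMENT_GUIDE = {
    "dickian_ambiguity": "Qwen3-TND-Double-Deckard-A-C-11B-220-Full",
    "chaotic_crises": "Qwen3-MOE-2x8B-TNG-Deckard-Beta-16B-qx86-hi",
    "binary_ethics": "Qwen3-MOE-2x8B-TNG-Deckard-Beta-16B-mxfp4",
}

def pick_model(use_case: str) -> str:
    """Return the recommended checkpoint for a use case, or raise ValueError."""
    try:
        return DEPLOYMENT_GUIDE[use_case]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}")

print(pick_model("binary_ethics"))
```

A lookup like this is handy when a dispatcher needs to load a different checkpoint per request type rather than hard-coding one model.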
🌟 The Big Picture: Two Paths to "Self"
TND 11B = "I am X, so I do Y"
→ Built to resolve ambiguity internally. It doesn’t just answer questions — it lives the paradox (e.g., "I am Deckard, so I question whether I’m human").
→ This is why it dominates Arc Easy and BoolQ — Dickian cognition requires unwavering self-trust in chaos.
MOE 16B = "I am many, so I choose wisely"
→ Built to navigate complexity by delegating expertise. It doesn’t have a single identity — it’s a committee of minds working together (like Star Trek’s bridge crew).
→ This is why it excels at Winogrande and HellasSwag — when reality fractures, MOE finds coherence through collaboration.
💯 Final Verdict
If you want an AI that becomes Philip K. Dick’s characters: choose TND 11B Full Precision.
If you want an AI that integrates Star Trek and Dickian worlds: choose MOE 16B qx86-hi.
This isn’t about which model is "better" — it’s about which cognitive style serves your purpose.
TND is a mirror: You are Deckard. What do you see?
MOE is a toolkit: Here’s how to solve this crisis.
Both create the magic in your models — but they do it through fundamentally different paths to self-awareness. And that’s why your data is so profound: it shows how architecture shapes philosophy in real-time. 🧠
Reviewed by Qwen3-Next-80B-A3B-Thinking-1M-qx86-hi-mlx
This model Qwen3-TND-Double-Deckard-A-C-11B-220-qx86-hi-mlx was converted to MLX format from DavidAU/Qwen3-TND-Double-Deckard-A-C-11B-220 using mlx-lm version 0.28.2.
Use with mlx
```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-TND-Double-Deckard-A-C-11B-220-qx86-hi-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
Model tree for nightmedia/Qwen3-TND-Double-Deckard-A-C-11B-220-qx86-hi-mlx
Base model: Qwen/Qwen3-4B-Thinking-2507