---
library_name: mlx
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-4B/blob/main/LICENSE
pipeline_tag: text-generation
base_model: mlx-community/Qwen3-4B-4bit-DWQ-053125
tags:
- mlx
---
# yakusokulabs/dr_qwen_v2
This model [yakusokulabs/dr_qwen_v2](https://huggingface.co/yakusokulabs/dr_qwen_v2) was converted to MLX format from [mlx-community/Qwen3-4B-4bit-DWQ-053125](https://huggingface.co/mlx-community/Qwen3-4B-4bit-DWQ-053125) using mlx-lm version 0.25.0.
## Use with mlx

```bash
pip install mlx-lm
```
# Yakusoku Labs – Dr Qwen v2
A 4-bit, Apple-MLX-ready medical Qwen-4B you can fine-tune and run on a single modern iPhone.
## 🧩 Model Summary

| | |
|---|---|
| Base | Qwen-3-4B |
| Precision | 4-bit NF4 (DWQ) |
| Framework | Apple MLX (`mlx-lm` 0.25.0) |
| Hardware used | 1 × Mac mini M4 Pro (16-core GPU) |
| Energy / time | ≈14 GPU-hours at ~60 W avg (≈4× less power than an equivalent PyTorch run) |
| License | Apache 2.0 (weights & code) |
Dr Qwen v2 is purpose-tuned for clinical Q&A, triage, and medication counseling while staying light enough for edge devices.
The current checkpoint is fine-tuned only on public medical datasets; de-identified Indian tele-medicine dialogues will be merged once legal sign-off is in place.
## 🎯 Intended Use & Limitations

**Intended**
- Medical trivia & exam datasets (MedMCQA, USMLE-style)
- Low-risk symptom triage with human oversight
- Research baseline for Apple-silicon ML pipelines

**Out of scope / MUST NOT**
- Autonomous diagnosis or prescription
- High-acuity decision support without a licensed clinician in the loop
- Any use that generates or stores protected health information (PHI)
## 📚 Training Data

| Corpus | Size | License |
|---|---|---|
| MedMCQA | 354 k QA | CC-BY-NC-SA-4.0 |
| MedQA-USMLE | 13 k QA | MIT |
| PubMedQA | 1 k QA | CC0 |
| MMLU-Medical | 1.2 k QA | MIT |
| ChatDoctor Dialogues | 100 k turns | Apache 2.0 |

Planned: +35 k doctor-annotated Indian tele-health Q&A (DPDP-compliant, de-identified).
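
The preprocessing code is not released; as a minimal sketch under assumptions, MedMCQA (public on the Hub as `openlifescienceai/medmcqa`) could be flattened into prompt/completion pairs for SFT like this (field names follow that release):

```python
# Sketch: flatten MedMCQA into prompt/completion pairs for SFT (illustrative).
import json
from datasets import load_dataset

ds = load_dataset("openlifescienceai/medmcqa", split="train")

with open("train.jsonl", "w") as f:
    for ex in ds:
        options = [ex["opa"], ex["opb"], ex["opc"], ex["opd"]]
        prompt = ex["question"] + "\n" + "\n".join(
            f"{letter}. {opt}" for letter, opt in zip("ABCD", options)
        )
        completion = "ABCD"[ex["cop"]]  # "cop" holds the index of the correct option
        f.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")
```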
## ⚙️ Training Procedure

- 3 epochs, batch size 128, LR 6e-5, cosine decay, seed 42 (a hypothetical CLI sketch follows this list)
- LoRA rank 64 on the query/key/value/projection matrices
- Gradient checkpointing & mixed precision; NF4 quantization applied after SFT
- Direct Preference Optimisation (DPO) on synthetic doctor ratings (3 B tokens)
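
The fine-tuning scripts themselves are not published. As a rough sketch, a LoRA run with similar hyperparameters could be launched through the `mlx_lm.lora` CLI that ships with `mlx-lm`; flag names vary between releases, and `./data` is a placeholder for a directory holding `train.jsonl` / `valid.jsonl`:

```bash
# Hypothetical invocation; flag names and defaults differ across mlx-lm releases.
# ./data is a placeholder directory containing train.jsonl / valid.jsonl.
mlx_lm.lora \
  --model mlx-community/Qwen3-4B-4bit-DWQ-053125 \
  --train \
  --data ./data \
  --batch-size 128 \
  --learning-rate 6e-5 \
  --iters 1000
```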
## 📊 Evaluation

| Benchmark (zero-shot) | Base Qwen-4B | Dr Qwen v2 | Llama-3.3-8B |
|---|---|---|---|
| MedMCQA | 57.8 % | 63.5 % | 64.1 % |
| PubMedQA | 48.6 % | 55.2 % | 56.0 % |
**Clinician panel** (500 simulated consultations via Yakusoku's multi-agent sandbox): 94 % of answers were tagged "clinically acceptable", 3 pp shy of the human baseline and 9–12 pp above the base models.
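
The exact evaluation harness is not included with this card. A minimal zero-shot MedMCQA accuracy loop, assuming the public `openlifescienceai/medmcqa` validation split and a crude letter-matching heuristic (not the harness behind the table above), might look like:

```python
# Rough zero-shot MedMCQA accuracy estimate (illustrative, not the exact harness).
from datasets import load_dataset
from mlx_lm import load, generate

model, tokenizer = load("yakusokulabs/dr_qwen_v2")
ds = load_dataset("openlifescienceai/medmcqa", split="validation")

n, correct = 200, 0  # small sample for a quick estimate
for ex in ds.select(range(n)):
    options = [ex["opa"], ex["opb"], ex["opc"], ex["opd"]]
    question = ex["question"] + "\n" + "\n".join(
        f"{letter}. {opt}" for letter, opt in zip("ABCD", options)
    ) + "\nAnswer with a single letter:"
    if tokenizer.chat_template is not None:
        question = tokenizer.apply_chat_template(
            [{"role": "user", "content": question}], add_generation_prompt=True
        )
    reply = generate(model, tokenizer, prompt=question, max_tokens=8)
    predicted = next((c for c in reply.upper() if c in "ABCD"), None)  # crude parse
    correct += predicted == "ABCD"[ex["cop"]]

print(f"accuracy ~ {correct / n:.1%}")
```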
## 🛡️ Safety & Responsible AI

- All datasets are public or de-identified; no raw PHI was ingested.
- **ClinGuard-Lite**, a rule-based filter, blocks guideline-violating outputs (e.g., antibiotic over-prescription); a toy illustration follows this list.
- Blinded trials with IRB oversight are planned for Q3 2025.
- Please keep a licensed clinician in the loop.
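
ClinGuard-Lite itself is not released. The toy filter below only illustrates the general rule-based idea; the patterns and fallback message are invented for the example:

```python
# Toy rule-based output filter in the spirit of ClinGuard-Lite (hypothetical rules).
import re

BLOCKED_PATTERNS = [
    r"\btake\s+\d+\s*mg\b",                        # unsolicited dosing instructions
    r"\b(amoxicillin|azithromycin)\b.*\bviral\b",  # antibiotics for a viral illness
    r"\bno need to see a doctor\b",                # discourages escalation of care
]

def is_safe(answer: str) -> bool:
    """Return False when the answer matches any blocked pattern."""
    return not any(re.search(p, answer, re.IGNORECASE) for p in BLOCKED_PATTERNS)

reply = "For a common cold, azithromycin will clear the viral infection."
if not is_safe(reply):
    reply = "I can't recommend a prescription; please consult a licensed clinician."
print(reply)
```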
## 🚀 Quick Start (Apple MLX)

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("yakusokulabs/dr_qwen_v2")

prompt = "Patient: I have a mild cough and low-grade fever.\nDoctor:"

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
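
Sampling can be tuned as well. The snippet below assumes the `make_sampler` helper available in recent `mlx-lm` releases; exact parameter names may differ by version:

```python
# Temperature/top-p sampling via make_sampler (ships with recent mlx-lm versions).
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("yakusokulabs/dr_qwen_v2")
sampler = make_sampler(temp=0.7, top_p=0.9)
answer = generate(
    model,
    tokenizer,
    prompt="What are common side effects of metformin?",
    max_tokens=256,
    sampler=sampler,
)
print(answer)
```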