Medical Reasoning GPT-OSS-20B

Model Description

This is a fine-tuned version of openai/gpt-oss-20b specifically optimized for medical reasoning and clinical decision-making. The model has been trained on high-quality medical reasoning datasets to provide accurate and thoughtful responses to medical queries.

🏥 Key Features

Medical Expertise: Specialized in medical reasoning, diagnosis, and clinical decision-making
Complex Reasoning: Uses chain-of-thought reasoning for medical problems
Adapter-Only Training: Only the LoRA layers are trained; the base model remains frozen
Efficient: Lightweight fine-tuning with a small, adapter-only storage footprint
Ready-to-Use: Load the base model and apply this adapter for inference

🚀 Quick Start

#pip install torch --index-url https://download.pytorch.org/whl/cu128
#pip install "trl>=0.20.0" "peft>=0.17.0" "transformers>=4.55.0"

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import re

base_model_name = "openai/gpt-oss-20b"
adapter_name = "dousery/medical-reasoning-gpt-oss-20b"

tokenizer = AutoTokenizer.from_pretrained(base_model_name)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

model = PeftModel.from_pretrained(base_model, adapter_name)
model = model.merge_and_unload()

messages = [
    {"role": "system", "content": "You are a medical reasoning assistant."},
    {"role": "user", "content": (
        # Adjacent string literals are concatenated, so no stray newline or indentation reaches the prompt
        "A 55-year-old man has chest pain and elevated troponin I without ST elevation. "
        "What is the diagnosis and what additional test would you order next?"
    )}
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=False  # greedy decoding; temperature only applies when do_sample=True
)

raw_output = tokenizer.decode(outputs[0], skip_special_tokens=False)

# Parse the analysis (thinking) and final channels out of the raw output
thinking_pattern = r"<\|end\|><\|start\|>assistant<\|channel\|>analysis<\|message\|>(.*?)<\|end\|>"
final_pattern = r"<\|start\|>assistant<\|channel\|>final<\|message\|>(.*?)<\|return\|>"

thinking_match = re.search(thinking_pattern, raw_output, re.DOTALL)
final_match = re.search(final_pattern, raw_output, re.DOTALL)

thinking_text = thinking_match.group(1).strip() if thinking_match else "N/A"
final_text = final_match.group(1).strip() if final_match else "N/A"

print("Thinking:", thinking_text)
print("\nFinal:", final_text)
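
The two regexes above can also be folded into one reusable helper. The following is a minimal sketch, not part of the released code: the function name `split_channels` and the sample string are hypothetical, and it assumes the decoded output uses the same `<|channel|>`, `<|message|>`, `<|end|>`, and `<|return|>` markers that the Quick Start patterns rely on. It also falls back to end-of-string, so a response cut off by `max_new_tokens` still yields partial text.

```python
import re

# Channel markers as they appear in the Quick Start regexes.
_CHANNEL_RE = re.compile(
    r"<\|channel\|>(?P<channel>\w+)<\|message\|>(?P<body>.*?)"
    r"(?:<\|end\|>|<\|return\|>|$)",
    re.DOTALL,
)

def split_channels(raw_output: str) -> dict:
    """Return the last message seen on each channel (e.g. 'analysis', 'final')."""
    channels = {}
    for match in _CHANNEL_RE.finditer(raw_output):
        channels[match.group("channel")] = match.group("body").strip()
    return channels

# Hypothetical decoded string, shaped like the output the Quick Start expects.
sample = (
    "<|start|>user<|message|>Question?<|end|>"
    "<|start|>assistant<|channel|>analysis<|message|>"
    "Troponin up, no ST elevation.<|end|>"
    "<|start|>assistant<|channel|>final<|message|>Likely NSTEMI.<|return|>"
)

parsed = split_channels(sample)
print("Thinking:", parsed.get("analysis", "N/A"))
print("Final:", parsed.get("final", "N/A"))
```

With the Quick Start variables in scope, `split_channels(raw_output)` would replace the two `re.search` calls.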

📊 Training Details

Training Data

Dataset: FreedomIntelligence/medical-o1-reasoning-SFT
Language: English
Size: 19,704 medical reasoning examples
Format: Question-Answer pairs with complex chain-of-thought reasoning

Training Configuration

Base Model: unsloth/gpt-oss-20b (20B parameters)
Training Method: LoRA (adapter-only fine-tuning)
LoRA Rank: 8
Learning Rate: 5e-5
Batch Size: 4 per device, gradient_accumulation_steps=4
Epochs: 2
Max Sequence Length: 2048
LR Scheduler: Cosine, warmup_ratio=0.05
Final Training Loss: 1.22
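
As a rough illustration of what these settings imply, the sketch below estimates the optimizer step count and the learning rate at a few points in training. It assumes a single GPU (the device count is not stated here), no dropped batches, and a cosine schedule that decays to zero after a linear warmup. The exact numbers depend on the trainer and its version, so treat them as estimates rather than a record of the actual run.

```python
import math

NUM_EXAMPLES = 19_704      # from Training Data
PER_DEVICE_BATCH = 4       # from Training Configuration
GRAD_ACCUM = 4
EPOCHS = 2
PEAK_LR = 5e-5
WARMUP_RATIO = 0.05
NUM_DEVICES = 1            # assumption: the GPU count is not stated

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print("effective batch:", effective_batch)
print("steps per epoch:", steps_per_epoch)
print("total steps:", total_steps, "| warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step}: lr = {lr_at(step):.2e}")
```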

Model Architecture

Parameters: 20.9 billion
Architecture: GPT-OSS (Transformer-based)
Context Length: 2,048 tokens (maximum sequence length used for fine-tuning)
Trainable Parameters: 3.98M (0.02% of total)
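
The trainable share and the adapter's approximate size follow directly from the two parameter counts above. The sketch below is a back-of-the-envelope check only: it counts weights alone, and the 16-bit and 32-bit storage precisions are assumptions, since the format of the saved adapter is not stated here.

```python
TOTAL_PARAMS = 20.9e9       # from Model Architecture
TRAINABLE_PARAMS = 3.98e6   # LoRA parameters, from Model Architecture

trainable_pct = 100 * TRAINABLE_PARAMS / TOTAL_PARAMS

def adapter_size_mb(num_params: float, bytes_per_param: int) -> float:
    """Weights-only size in megabytes (1 MB = 1e6 bytes), ignoring file metadata."""
    return num_params * bytes_per_param / 1e6

print(f"Trainable share: {trainable_pct:.3f}%")
print(f"Adapter at 16-bit: ~{adapter_size_mb(TRAINABLE_PARAMS, 2):.1f} MB")
print(f"Adapter at 32-bit: ~{adapter_size_mb(TRAINABLE_PARAMS, 4):.1f} MB")
```

Either way, the adapter is on the order of megabytes, while the base model it attaches to is tens of gigabytes.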

🎯 Intended Use

Primary Use Cases

Medical Education: Explaining medical concepts and procedures
Clinical Reasoning: Analyzing symptoms and differential diagnosis
Research Support: Assisting in medical research and literature review
Decision Support: Providing reasoning for clinical decisions (with human oversight)

⚠️ Important Disclaimers

Not a Medical Device: This model is for educational and research purposes only
Human Oversight Required: All medical decisions should involve qualified healthcare professionals
Accuracy Not Guaranteed: Model outputs should be verified against current medical literature
Regional Variations: Training data may not reflect all regional medical practices

🔍 Evaluation

The model demonstrates strong performance in:

Medical concept explanation
Differential diagnosis reasoning
Treatment option analysis
Pathophysiology understanding

Note: Comprehensive clinical evaluation is ongoing. Always validate outputs with current medical guidelines.

🛠️ Technical Requirements

Minimum Requirements

GPU Memory: 16GB+ VRAM recommended
RAM: 32GB+ system memory
Storage: 40GB+ free space

📜 License

This model is released under the Apache 2.0 license. Please review the license terms before commercial use.

🙏 Acknowledgments

Base Model: openai/gpt-oss-20b
Adapter/Training: dousery/medical-reasoning-gpt-oss-20b
Dataset: FreedomIntelligence/medical-o1-reasoning-SFT
Infrastructure: Modal Labs for GPU compute

📞 Contact

For questions, issues, or collaboration opportunities, please reach out through the Hugging Face community discussions or via LinkedIn.
