UMA-4B
Agentic RL fine-tuned model
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("dp66/UMA-4B")
model = AutoModelForCausalLM.from_pretrained("dp66/UMA-4B")
Training Details
- Base Model: Qwen/Qwen3-4B-Instruct-2507