# Model Card for Fitness Agent (14B-Qwen2.5)
This is a fine-tuned LoRA adapter for unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit, trained to act as a specialized Fitness & Nutrition Agent. The model was trained using Group Relative Policy Optimization (GRPO) to improve its reasoning capabilities in creating personalized workout plans, analyzing nutrition logs, and providing evidence-based health advice.
## Model Details
### Model Description
This model is an RL-finetuned version of Qwen 2.5 14B designed to solve complex fitness and nutrition queries. Unlike standard LLMs, this agent was trained with specific rewards for:
- Reasoning Quality: Producing logical, step-by-step explanations for its recommendations.
- Safety & Constraints: Strictly adhering to dietary restrictions (allergies, preferences) and physical limitations.
- Format Compliance: Generating structured JSON outputs for workout plans and diet logs when required.
It uses the LangGraph framework to manage agent state and tool invocation during training.
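The format-compliance reward above can be illustrated with a minimal sketch. The schema below is hypothetical (the actual reward schema is not published here); it only shows the kind of machine-checkable structure the reward targets.

```python
import json

# Hypothetical structured output of the kind the format-compliance reward
# checks; the real schema used during training may differ.
plan_text = """
{
  "plan_name": "Beginner Bodyweight, 3 Days",
  "days": [
    {"day": 1, "focus": "full body",
     "exercises": [{"name": "push-up", "sets": 3, "reps": 10}]}
  ],
  "constraints_respected": ["no equipment"]
}
"""

plan = json.loads(plan_text)            # a format reward fails if parsing fails
assert {"plan_name", "days"} <= plan.keys()
print(plan["days"][0]["exercises"][0]["name"])  # -> push-up
```

A parse-then-validate check like this is easy to turn into a binary reward signal during GRPO training.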
- Developed by: socaitcy
- Funded by: Self-funded
- Model type: LoRA Adapter (Fine-tuned Causal LM)
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit
## Uses
### Direct Use
This model is intended to be used as a conversational assistant or API backend for:
- Generating personalized weekly workout routines.
- Calculating macronutrient needs based on user stats.
- Answering questions about exercise form and dietary science.
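The macronutrient use case can be sketched numerically. This is an illustrative calculation only, not the model's internal method: it uses the standard Mifflin-St Jeor equation for basal metabolic rate, and the protein and fat splits are assumed common guidelines.

```python
def daily_macros(weight_kg: float, height_cm: float, age: int,
                 sex: str = "male", activity: float = 1.5) -> dict:
    """Illustrative macro estimate via the Mifflin-St Jeor equation."""
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age + (5 if sex == "male" else -161)
    tdee = bmr * activity                 # total daily energy expenditure
    protein_g = 1.6 * weight_kg           # assumed guideline: 1.6 g/kg body weight
    fat_g = 0.25 * tdee / 9               # assumed ~25% of calories from fat
    carbs_g = (tdee - protein_g * 4 - fat_g * 9) / 4  # remainder as carbohydrate
    return {"calories": round(tdee), "protein_g": round(protein_g),
            "fat_g": round(fat_g), "carbs_g": round(carbs_g)}

print(daily_macros(weight_kg=80, height_cm=180, age=30))
```

For an 80 kg, 180 cm, 30-year-old male at moderate activity this yields roughly 2,670 kcal/day with 128 g protein.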
### Downstream Use
Integrated into the fitness-reasoning-rl-agent system, where it can call external tools (search, database lookups) to augment its answers with real-time data.
### Out-of-Scope Use
- Medical Advice: This model is for fitness and wellness coaching only. It is not a substitute for professional medical advice, diagnosis, or treatment.
- Extreme Diets: The model should not be used to generate dangerous or extreme weight loss protocols.
## Bias, Risks, and Limitations
- Hallucination: Like all LLMs, it may invent facts or describe exercises that do not exist.
- Knowledge Cutoff: Its knowledge is limited to the base model's training data plus the fine-tuning dataset; it may not know the very latest fitness trends unless provided via context.
- User Physiology: It relies on user-provided data (weight, age, etc.) and cannot verify physical health status.
### Recommendations
Users should always consult a physician before starting any exercise or nutrition program generated by this model.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the 4-bit quantized base model, then attach the LoRA adapter.
config = PeftConfig.from_pretrained("socaitcy/fitness-agent-14B-qwen2.5-adapter")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit",
    device_map="auto",
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(base_model, "socaitcy/fitness-agent-14B-qwen2.5-adapter")
tokenizer = AutoTokenizer.from_pretrained("unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit")

# Generate a response; move inputs to wherever the model was placed.
prompt = "Create a 3-day workout plan for a beginner with no equipment."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details
### Training Data
The model was trained on a custom dataset of fitness scenarios (data/fitness_scenarios.jsonl), including:
- Synthetic user profiles with specific goals (e.g., "Lose 5kg", "Marathon prep").
- Validated nutritional constraints (e.g., "Vegan", "Gluten-free").
- Correct vs. incorrect workout split logic.
### Training Procedure
#### Preprocessing
Data was formatted into specific prompt templates used by the agent system to simulate user interactions.
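The preprocessing step can be sketched as follows. Both the example record and the template string are hypothetical; the actual templates used by the agent system may differ.

```python
import json

# Hypothetical: format one fitness_scenarios.jsonl record into the chat
# messages used to simulate a user interaction.
record = json.loads(
    '{"goal": "Lose 5kg", "constraints": ["Vegan"], "profile": {"age": 30, "weight_kg": 80}}'
)

USER_TEMPLATE = (
    "Goal: {goal}\n"
    "Dietary constraints: {constraints}\n"
    "Profile: {profile}\n"
    "Produce a step-by-step plan and a JSON summary."
)

messages = [
    {"role": "system",
     "content": "You are a fitness and nutrition coach. Respect all user constraints."},
    {"role": "user",
     "content": USER_TEMPLATE.format(
         goal=record["goal"],
         constraints=", ".join(record["constraints"]),
         profile=record["profile"],
     )},
]
print(messages[1]["content"].splitlines()[0])  # -> Goal: Lose 5kg
```

A message list in this shape can then be rendered with the tokenizer's chat template before being fed to the policy.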
#### Training Hyperparameters
- Training regime: Mixed precision (bf16) with LoRA (Rank=8, Alpha=16).
- Optimizer: AdamW 8-bit
- Method: GRPO (Group Relative Policy Optimization)
- Quantization: 4-bit (BitsAndBytes)
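The listed hyperparameters map onto PEFT/TRL configuration roughly as sketched below. Only the LoRA rank/alpha, bf16 regime, and 8-bit AdamW come from the card; `output_dir`, `num_generations`, and the target modules are assumptions, and the exact field names may vary across TRL versions.

```python
from peft import LoraConfig
from trl import GRPOConfig

# LoRA adapter settings from the card: Rank=8, Alpha=16.
lora_config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")

# GRPO training arguments (values other than bf16/optimizer are assumed).
training_args = GRPOConfig(
    output_dir="fitness-agent-grpo",  # hypothetical
    bf16=True,                        # mixed-precision regime from the card
    optim="adamw_8bit",               # AdamW 8-bit
    num_generations=4,                # completions per prompt group (assumed)
)
```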
## Environmental Impact
- Hardware Type: NVIDIA GPU (e.g., H100/A100/4090)
- Hours used: ~2-10 hours (Estimated)
- Cloud Provider: Private / Local
- Compute Region: Local
## Citation
BibTeX:
```bibtex
@misc{fitness-agent-2025,
  author       = {socaitcy},
  title        = {Fitness Agent 14B (Qwen2.5 LoRA)},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Repository},
  howpublished = {\url{https://huggingface.co/socaitcy/fitness-agent-14B-qwen2.5-adapter}}
}
```

### Framework versions
- PEFT 0.18.0
- Transformers
- Unsloth
- TRL