Fermata – Fine-tuned Gemma AI Assistant

Fermata is a fine-tuned version of Google's gemma-2-2b-it, trained to act as a personalized AI assistant that responds with character, helpfulness, and consistency. It is designed to follow instructions, engage in conversation, and adapt to specific behavioral traits or personas.


Model Details

  • Base Model: google/gemma-2-2b-it
  • Fine-tuned by: @ranggafermata
  • Framework: 🤗 Transformers + PEFT + LoRA (Unsloth)
  • Precision: 4-bit quantized (NF4) during training, merged to full F32 weights (see the quantized-reload sketch after this list)
  • Model Size: ~2.61B parameters
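
Since the published checkpoint stores full F32 weights (roughly 10 GB across three shards), memory-constrained setups may prefer to reload it quantized. Below is a minimal sketch using transformers' bitsandbytes integration; the NF4 settings mirror the training precision described above but are otherwise an assumption, not part of this release.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical 4-bit reload: NF4 weights with bf16 compute (both assumed).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "ranggafermata/Fermata",
    quantization_config=bnb_config,
    device_map="auto",
)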

Training Details

  • LoRA Configuration (see the training sketch after this list):
    • r: 16
    • alpha: 16
    • dropout: 0.05
    • Target modules: attention & MLP projection layers
  • Epochs: 12
  • Dataset: Custom instruction-response pairs built to teach Fermata its identity and assistant behavior
  • Tooling: Unsloth, 🤗 PEFT, trl's SFTTrainer
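
The hyperparameters above map onto the standard Unsloth + trl recipe roughly as follows. This is a minimal sketch, not the actual training script: the base checkpoint name, max_seq_length, the exact target-module list, and the toy dataset are assumptions, and argument names shift between trl versions.

from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit NF4, as stated under Model Details (base name assumed).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-2-2b-it",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,
)

# Attach LoRA adapters with the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    # The usual Gemma projection modules, matching "attention & MLP projection layers":
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Stand-in for the custom identity/behavior instruction-response pairs.
dataset = Dataset.from_list([
    {"text": "### Human:\nWho are you?\n\n### Assistant:\nI am Fermata."},
])

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    args=TrainingArguments(num_train_epochs=12, output_dir="outputs"),
)
trainer.train()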

Files Included

  • ✅ model-00001-of-00003.safetensors to model-00003-of-00003.safetensors
  • ✅ config.json, tokenizer.model, tokenizer.json
  • ✅ generation_config.json, chat_template.jinja
  • ❌ Adapter weights removed before upload (merged into the base model; see the merge sketch below)
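
For reference, a minimal sketch of how a LoRA adapter is typically folded into base weights with 🤗 PEFT. The adapter path and base checkpoint here are placeholders; the actual merge script used for this release is not published.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Reload the base model in full precision (F32, matching the shipped weights).
base = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it", torch_dtype=torch.float32)

# Apply the trained LoRA adapter, then fold it into the base weights.
model = PeftModel.from_pretrained(base, "path/to/fermata-lora")  # hypothetical adapter dir
model = model.merge_and_unload()

# Saves sharded .safetensors with no separate adapter files.
model.save_pretrained("Fermata")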

Example Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged F32 checkpoint; device_map="auto" places weights on available devices.
model = AutoModelForCausalLM.from_pretrained("ranggafermata/Fermata", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ranggafermata/Fermata")

# Plain-text prompt in the ### Human / ### Assistant instruction format.
prompt = "### Human:\nWho are you?\n\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
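
Since the repo also ships chat_template.jinja, generation can instead go through the tokenizer's chat template, reusing the model and tokenizer loaded above. Whether the bundled template matches the ### Human / ### Assistant format is an assumption worth verifying before relying on it.

# Build the prompt from the bundled chat template rather than by hand.
messages = [{"role": "user", "content": "Who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))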