Afri-Aya Gemma 3 4B Vision Model (Single File) 🌍

The definitive single-file version of the Afri-Aya Gemma 3 4B vision model for African cultural visual question answering.

🎯 Key Features

  • ✅ Single adapter_model.safetensors file (587MB) - no sharding
  • ✅ GGUF conversion ready - works with llama.cpp conversion tools
  • ✅ Enhanced LoRA v2 - r=64, alpha=64 (4x higher rank than v1)
  • ✅ 13 African languages + English support
  • ✅ Cultural expertise - trained on 2,466 African cultural images

🌍 Supported Languages

English + 13 African Languages: Luganda, Kinyarwanda, Egyptian Arabic, Twi, Hausa, Nyankore, Yoruba, Kirundi, Zulu, Swahili, Gishu, Krio, Igbo

💻 Quick Start

from transformers import AutoModelForVision2Seq, AutoProcessor
import torch
from PIL import Image

# Load model
model = AutoModelForVision2Seq.from_pretrained(
    "Bronsn/afri-aya-gemma-3-4b-vision-single",
    torch_dtype=torch.float16,
    device_map="auto"
)
processor = AutoProcessor.from_pretrained("Bronsn/afri-aya-gemma-3-4b-vision-single")

# Load image
image = Image.open("your_image.jpg")

# Ask about African culture
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What cultural significance does this image have?"},
            {"type": "image"},
        ],
    }
]

# Generate a response
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
# Move inputs onto the same device as the model (needed with device_map="auto")
inputs = processor(text=input_text, images=image, add_special_tokens=False, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=1.0, top_p=0.95, top_k=64)
    response = processor.decode(output[0], skip_special_tokens=True)
    print(response)
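
Since this repository ships a LoRA adapter (adapter_model.safetensors) rather than merged weights, you can also attach the adapter to the base checkpoint explicitly with PEFT. A minimal sketch, assuming the peft package is installed:

from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import PeftModel
import torch

# Load the base instruction-tuned checkpoint
base_model = AutoModelForVision2Seq.from_pretrained(
    "unsloth/gemma-3-4b-it",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Attach the Afri-Aya LoRA adapter from this repository
model = PeftModel.from_pretrained(base_model, "Bronsn/afri-aya-gemma-3-4b-vision-single")
processor = AutoProcessor.from_pretrained("Bronsn/afri-aya-gemma-3-4b-vision-single")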

🔄 GGUF Conversion

Perfect for GGUF conversion with no sharding issues:

python convert-hf-to-gguf.py /path/to/model --outdir ./gguf-models/
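
Note that convert-hf-to-gguf.py generally expects a full Hugging Face checkpoint. If you are starting from the adapter alone, one option is to merge it into the base model first with PEFT; a sketch (the output directory name below is only illustrative):

from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import PeftModel
import torch

# Load the base model, apply the adapter, then fold the LoRA weights into the base weights
base_model = AutoModelForVision2Seq.from_pretrained("unsloth/gemma-3-4b-it", torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base_model, "Bronsn/afri-aya-gemma-3-4b-vision-single").merge_and_unload()

# Save a plain (non-PEFT) checkpoint that the conversion script can read
merged.save_pretrained("./afri-aya-merged")
AutoProcessor.from_pretrained("Bronsn/afri-aya-gemma-3-4b-vision-single").save_pretrained("./afri-aya-merged")

Then point the conversion command above at ./afri-aya-merged.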

📊 Model Details

  • Base: unsloth/gemma-3-4b-it (instruction-tuned)
  • Dataset: CohereLabsCommunity/afri-aya (2,466 images)
  • Training: Enhanced LoRA with r=64, alpha=64 (see the configuration sketch below)
  • File: Single adapter_model.safetensors (587MB)
  • Languages: 14 total (English + 13 African)
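
For reference, the reported hyperparameters correspond to a PEFT LoraConfig along these lines. This is a sketch only: the exact target modules and dropout used in training are not listed on this card, so those fields are placeholders:

from peft import LoraConfig

lora_config = LoraConfig(
    r=64,               # LoRA rank (v1 used r=16)
    lora_alpha=64,      # LoRA scaling factor (v1 used alpha=32)
    lora_dropout=0.0,   # placeholder - actual value not documented here
    bias="none",
    target_modules=[    # placeholder - typical attention/MLP projection layers
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)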

πŸ† Performance

v2 Improvements over v1:

  • 4x higher LoRA rank (64 vs 16)
  • 2x higher LoRA alpha (64 vs 32)
  • Both vision + language fine-tuning
  • Single file format (no sharding)

🔗 Related


Created with ❤️ for African culture preservation and education
