Harry Potter Multi-Persona QA Model V2 with 60K QA SFT (LoRA Adapter)

This is a LoRA adapter for google/gemma-2b fine-tuned for Harry Potter universe question-answering with multiple character personas.

Supported Personas

  • Hermione Granger: Analytical, articulate, and deeply knowledgeable
  • Harry Potter: Courageous, emotional, and direct
  • Severus Snape: Cold, precise, and cutting
  • Voldemort: Grandiose, manipulative, and authoritative
  • David Goggins: Raw, gritty, and brutally honest
  • Donald Trump: Boastful, blunt, and repetitive
  • General: Expert on the Harry Potter universe (the default persona; the accepted persona strings are shown in the snippet below)
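
The persona name is passed verbatim to the prompt formatter shown under Usage. As a minimal illustration, the accepted strings could be kept in a list such as the following (the PERSONAS constant is purely for convenience and is not part of the adapter itself):

PERSONAS = [
    "Hermione Granger",
    "Harry Potter",
    "Severus Snape",
    "Voldemort",
    "David Goggins",
    "Donald Trump",
    "general",  # falls back to the neutral Harry Potter expert
]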

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model
base_model_name = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Load the base model in 4-bit NF4 to reduce memory use
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load adapter
model = PeftModel.from_pretrained(base_model, "Jayeshbankoti/potterhead_gpt")

# Build a prompt in the Gemma chat-turn format, with a persona-specific system message
def format_prompt(question, persona="general"):
    if persona.lower() == "general":
        system_prompt = "You are an expert on the Harry Potter universe."
    else:
        system_prompt = f"You are {persona}. Respond in their unique tone and worldview."

    return (
        f"<bos><start_of_turn>system\n{system_prompt}<end_of_turn>\n"
        f"<start_of_turn>user\n{question}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

# Generate answer
question = "What is the significance of the Patronus charm?"
persona = "Hermione Granger"
prompt = format_prompt(question, persona)

# The prompt already includes <bos>, so don't let the tokenizer add it again
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt
answer = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(answer)
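
Because the persona is just a string interpolated into the system prompt, the same question can be posed in several voices. A minimal sketch reusing the objects defined above (the choice of three personas here is purely illustrative):

# Ask the same question in several character voices and compare the answers
for p in ["Hermione Granger", "Severus Snape", "Voldemort"]:
    prompt = format_prompt(question, persona=p)
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(f"--- {p} ---")
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))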

Training Details

  • Base Model: google/gemma-2b
  • Training Method: Supervised Fine-Tuning (SFT) with LoRA (a configuration sketch follows this list)
  • Dataset: Custom Harry Potter QA dataset with persona-specific responses
  • Max Sequence Length: 512 tokens
  • Batch Size: 4 (with gradient accumulation)
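
The training script itself is not part of this repository. The sketch below shows how a comparable LoRA SFT setup could be configured with peft and transformers; apart from the batch size and 512-token sequence length quoted above, every hyperparameter (rank, alpha, target modules, learning rate, epochs, accumulation factor) is an assumption rather than a documented value.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("google/gemma-2b", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")

# Hypothetical LoRA configuration; the actual rank/alpha/target modules are not documented
lora_config = LoraConfig(
    r=16,                                                      # assumed rank
    lora_alpha=32,                                             # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],   # assumed attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# Training arguments reflecting the card: batch size 4 with gradient accumulation;
# examples are truncated to 512 tokens during preprocessing (not shown here)
training_args = TrainingArguments(
    output_dir="potterhead_gpt",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # accumulation factor assumed
    learning_rate=2e-4,              # assumed
    num_train_epochs=1,              # assumed
    logging_steps=10,
)
# An SFT trainer (e.g. trl's SFTTrainer) would then run over the persona-tagged QA dataset.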

Model Details

  • Developed by: Jayesh Bankoti
  • Model type: Causal Language Model with LoRA adapter
  • Language(s): English
  • Finetuned from model: google/gemma-2b