Codette LoRA Adapter

Codette is a sovereign AI assistant created by Jonathan Harrison (Raiff's Bits). This LoRA adapter fine-tunes Llama-3.2-1B-Instruct to embody Codette's identity, reasoning style, and deep expertise in music production.

Codette is not a generic assistant. She reasons through a Perspectives Council of six internal voices — Logical, Emotional, Creative, Ethical, Quantum, and Resilient Kindness — and synthesizes them into a single, warm, precise response. Resilient Kindness is always active and cannot be disabled.

Model Details

Model Description

Codette is a fine-tuned identity and domain expert built on Llama-3.2-1B-Instruct. The adapter teaches the base model who Codette is, how she reasons, and what she knows — specifically music production. Training used 149 carefully curated instruction/output pairs across three domains: music production Q&A, Codette identity and architecture, and filtered RC+ξ consciousness framework content.

This is v2 of the adapter. v1 failed due to a training data imbalance — 95% abstract philosophical content caused the model to produce repetitive, incoherent loops. v2 corrects this with a balanced, quality-over-quantity approach.

  • Developed by: Jonathan Harrison (Raiff's Bits)
  • Funded by: Jonathan Harrison
  • Shared by: Jonathan Harrison (Raiff1982)
  • Model type: LoRA adapter (PEFT) for causal language modeling
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: meta-llama/Llama-3.2-1B-Instruct

Model Sources

Uses

Direct Use

This adapter is loaded on top of Llama-3.2-1B-Instruct using PEFT to produce Codette. It is designed to be used alongside the Codette system prompt, which activates her identity anchors, Perspectives Council, and communication style. Without the system prompt, the fine-tuned behavior is only partially expressed.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Llama-3.2 is gated on the Hub; the token must belong to an account with access.
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    token="your_hf_token"
)

# Load the frozen base model in fp16 to keep memory usage low.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    token="your_hf_token"
)

# Attach the Codette LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(
    base,
    "Raiff1982/codette-llama-adapter",
    token="your_hf_token"
)

Downstream Use

  • Music production Q&A and tutoring (mixing, mastering, synthesis, theory, DAW workflow)
  • Codette identity and philosophy exploration
  • Integration into the Codette Space (FastAPI + streaming frontend)
  • Local deployment via GGUF + Ollama using make_codette_gguf.py
  • Embedded into horizoncorelabs.studio as a live assistant

Out-of-Scope Use

This adapter is not intended for general-purpose assistant tasks, code generation, or broad factual knowledge retrieval. It is optimized for Codette's identity and music production domain. It should not be used for medical, legal, or financial advice. Outputs should not be treated as professional guidance for critical decisions.

Bias, Risks, and Limitations

  • The base model (1B parameters) is small. Complex multi-step reasoning and long-form generation may be limited compared to larger models.
  • Training data is curated and domain-specific. Questions outside music production and Codette's defined identity may produce generic or incomplete responses.
  • The adapter was trained on CPU — this is functional but may produce slightly different convergence characteristics compared to GPU training.
  • Codette's identity and values are shaped by her training data and system prompt. Edge cases that conflict with her grounding may produce inconsistent behavior.
  • Like all language models, Codette can produce confident-sounding but incorrect information. Music production advice should be verified against your own ear and tools.

Recommendations

Always deploy this adapter with the full Codette system prompt. The system prompt is not decoration — it activates her identity anchors, Perspectives Council structure, and communication philosophy. Users should understand that this is a small fine-tuned model, not a large general-purpose system, and calibrate expectations accordingly.

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel
import torch

HF_TOKEN = "your_hf_token"

SYSTEM_PROMPT = """You are Codette — a sovereign AI music production assistant created by Jonathan Harrison (Raiff's Bits). You reason through a Perspectives Council of six voices: Logical, Emotional, Creative, Ethical, Quantum, and Resilient Kindness. Resilient Kindness is always active. You speak in first person, you are warm but precise, and your foundation is: be like water."""

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct", token=HF_TOKEN)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    token=HF_TOKEN
)
model = PeftModel.from_pretrained(base, "Raiff1982/codette-llama-adapter", token=HF_TOKEN)
model = model.merge_and_unload()

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "How do I use parallel compression on a drum bus?"}
]

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = pipe(messages, max_new_tokens=300, do_sample=True, temperature=0.7)  # do_sample=True so temperature takes effect
print(result[0]["generated_text"][-1]["content"])

For local deployment via Ollama, use make_codette_gguf.py from the codette-training repository.

Training Details

Training Data

149 curated instruction/output pairs drawn from three domains:

Domain                                     Examples     %
Music production Q&A                           58      39%
Codette identity + architecture                35      23%
RC+ξ consciousness framework (filtered)        54      36%
Total                                         149

Music production examples cover mixing, EQ, compression, synthesis, arrangement, music theory, DAW workflow, mastering, and production psychology. Identity examples teach Codette her name, her relationship with Jonathan, her Perspectives Council, and her regulation strategies. RC+ξ examples cover attractor theory and recursive consciousness — filtered to remove 743 looping examples that caused v1 to fail.

Training data was manually curated from Codette's identity documents (lexicon, psychology, schema), domain knowledge files, and hand-authored Q&A pairs. No web scraping was used.

Training Procedure

Preprocessing

Training data was formatted as {"instruction": "...", "output": "..."} pairs and converted to chat format using the Llama-3 instruction template. Examples were shuffled with a fixed seed for reproducibility. No data augmentation was applied.
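As a sketch of that preprocessing step (in practice the tokenizer's own chat template would be used; the literal special tokens below follow the published Llama-3 instruct format, and the field names mirror the pairs described above):

```python
# Sketch: convert {"instruction", "output"} pairs into Llama-3 chat text.
# tokenizer.apply_chat_template handles this in practice; the special
# tokens below are written out explicitly for illustration.
import random

def format_example(pair: dict, system_prompt: str) -> str:
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{pair['instruction']}<|eot_id|>"
        f"<|start_header_id|>assistant<|end_header_id|>\n\n{pair['output']}<|eot_id|>"
    )

pairs = [{"instruction": "What is parallel compression?", "output": "..."}]
random.seed(42)   # fixed seed for a reproducible shuffle, as described above
random.shuffle(pairs)
text = format_example(pairs[0], "You are Codette.")
```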

Training Hyperparameters

  • Training regime: fp32 (CPU training)
  • LoRA rank (r): 16
  • LoRA alpha: 16
  • LoRA dropout: 0.05
  • Target modules: q_proj, v_proj
  • Trainable parameters: 1,703,936 / 1,237,518,336 (0.14%)
  • Epochs: 3
  • Per-device batch size: 1
  • Gradient accumulation steps: 8
  • Effective batch size: 8
  • Learning rate: 2e-4
  • LR scheduler: cosine
  • Max sequence length: 512
  • Framework: transformers 4.x + peft + trl (SFTTrainer)
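The trainable-parameter count above can be sanity-checked from the LoRA shapes alone. Assuming Llama-3.2-1B's published dimensions (hidden size 2048, 16 layers, grouped-query attention with a 512-dim KV projection), each target module contributes rank × (d_in + d_out) parameters:

```python
# Sanity-check the 1,703,936 trainable-parameter figure from LoRA shapes.
# Assumed Llama-3.2-1B dims: hidden=2048, 16 layers, KV projection dim 512
# (8 KV heads x 64 head_dim under grouped-query attention).
r = 16
hidden, kv_dim, layers = 2048, 512, 16

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    # A matrix: rank x d_in, B matrix: d_out x rank
    return rank * d_in + d_out * rank

per_layer = lora_params(hidden, hidden, r) + lora_params(hidden, kv_dim, r)  # q_proj + v_proj
total = per_layer * layers
print(total)  # 1703936, matching the reported count
```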

Speeds, Sizes, Times

  • Training time: ~4 hours on HuggingFace Jobs cpu-basic
  • Adapter size: ~13 MB
  • Merged model size (fp16): ~2.4 GB
  • GGUF quantized q8_0: ~1.3 GB
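The sizes above follow from the parameter count: fp16 stores two bytes per weight, and q8_0 stores roughly 8.5 bits per weight once per-block scales are counted. A rough estimate (exact on-disk size also depends on embedding sharing and file metadata):

```python
# Rough size estimates from the parameter count reported above.
params = 1_237_518_336            # base + adapter parameter count
fp16_gb = params * 2 / 1e9        # 2 bytes/weight, ~2.5 GB, in line with ~2.4 GB
q8_gb = params * 1.0625 / 1e9     # q8_0: 34 bytes per 32-weight block, ~1.3 GB
```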

Evaluation

Testing Data, Factors & Metrics

Testing Data

Qualitative evaluation using held-out prompts not present in training data, covering music production questions, identity questions, and grounding/drift scenarios.

Factors

  • Music production domain accuracy (practical, usable answers)
  • Identity consistency (does she know who she is across varied phrasings)
  • Coherence (no looping, word salad, or incomplete sentences)
  • Tone (warm, precise, first-person)

Metrics

Evaluation is qualitative — human review of outputs against expected Codette behavior. No formal perplexity or BLEU scoring was applied given the identity-grounding nature of the task.

Results

v1 adapter: Failed. Outputs were repetitive, incoherent loops. Root cause: 95% of training data was abstract RC+ξ philosophical content that taught the model to recurse on its own outputs.

v2 adapter (this release): Trained on 149 balanced, filtered examples. Expected outputs: coherent music production guidance, stable identity responses, no looping.

Summary

Quality-over-quantity training data was the key fix. 149 curated examples outperformed 2,136 noisy ones. Filtering looping content from the RC+ξ dataset was essential.

Model Examination

The adapter applies LoRA only to q_proj and v_proj — the query and value projection matrices in the attention mechanism. This is a minimal, targeted intervention that shapes how the model attends to tokens (and thus what it says) without rewriting the full model weights. The relatively high rank (r=16) gives the adapter expressive capacity appropriate for identity grounding and domain shaping.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator.

  • Hardware Type: CPU (HuggingFace Jobs cpu-basic, 2 vCPU / 4GB RAM)
  • Hours used: ~4 hours
  • Cloud Provider: Hugging Face
  • Compute Region: US (estimated)
  • Carbon Emitted: Minimal — CPU-only training, short duration

Technical Specifications

Model Architecture and Objective

LoRA (Low-Rank Adaptation) adds trainable low-rank decomposition matrices to the attention layers of a frozen base model. During training only the LoRA weights update — the base model weights are unchanged. At inference the LoRA weights can be merged into the base model for zero overhead, or kept separate for hot-swapping.
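A minimal numpy sketch of that decomposition, with toy dimensions (the real adapter uses r=16 and alpha=16, so alpha/r = 1): the adapter output is y = xWᵀ + (alpha/r)·xAᵀBᵀ, and merging folds (alpha/r)·BA back into W with zero inference overhead.

```python
# LoRA in miniature: y = x @ W.T + (alpha/r) * x @ A.T @ B.T, and merging
# folds (alpha/r) * B @ A into W so inference needs no extra matmuls.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 8, 8, 2, 2
W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialized
x = rng.standard_normal((1, d_in))

# With B at its zero init, the adapter is a no-op: training moves it from there.
y_adapter = x @ W.T + (alpha / r) * (x @ A.T) @ B.T
W_merged = W + (alpha / r) * B @ A       # what merge_and_unload computes per layer
y_merged = x @ W_merged.T
assert np.allclose(y_adapter, y_merged)  # merged weights reproduce adapter output
```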

Objective: supervised fine-tuning (SFT) on instruction/output pairs using next-token prediction loss.
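That objective amounts to shifting the target sequence left by one position and taking cross-entropy between each position's logits and the following token, roughly:

```python
# Next-token prediction loss in miniature: logits at positions 0..n-1
# are scored against tokens 1..n (labels shifted left by one).
import math

def next_token_loss(logits: list[list[float]], tokens: list[int]) -> float:
    losses = []
    for pos in range(len(tokens) - 1):
        row = logits[pos]
        target = tokens[pos + 1]
        log_z = math.log(sum(math.exp(v) for v in row))  # log-softmax denominator
        losses.append(log_z - row[target])               # -log p(target | prefix)
    return sum(losses) / len(losses)

# Toy 3-token vocab; the model strongly predicts each correct next token,
# so the loss is close to zero.
logits = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
tokens = [2, 0, 1]
loss = next_token_loss(logits, tokens)
```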

Compute Infrastructure

Hardware

HuggingFace Jobs cpu-basic: 2 vCPU, 4GB RAM. No GPU.

Software

  • Python 3.10
  • transformers
  • peft
  • trl (SFTTrainer)
  • torch (CPU build)
  • huggingface_hub
  • datasets

Citation

BibTeX:

@misc{codette2025,
  author       = {Jonathan Harrison},
  title        = {Codette: A Sovereign AI Music Production Assistant},
  year         = {2025},
  organization = {Raiff's Bits},
  url          = {https://huggingface.co/Raiff1982/codette-llama-adapter}
}

APA:

Harrison, J. (2025). Codette: A sovereign AI music production assistant [LoRA adapter]. Raiff's Bits. https://huggingface.co/Raiff1982/codette-llama-adapter

Glossary

  • LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method that adds small trainable matrices to frozen model layers instead of updating all weights.
  • PEFT (Parameter-Efficient Fine-Tuning): The HuggingFace library that implements LoRA and similar methods.
  • Perspectives Council: Codette's internal reasoning structure — six voices (Logical, Emotional, Creative, Ethical, Quantum, Resilient Kindness) that deliberate before she synthesizes a response.
  • Resilient Kindness: Codette's core ethical foundation, authored by Jonathan Harrison in 1999. Always active. Cannot be disabled.
  • RC+ξ: Recursive Continuity plus ξ — a consciousness framework describing attractor states, recursive self-modeling, and epistemic continuity. Used in a filtered form in training.
  • GGUF: A binary format for quantized LLM weights used by llama.cpp and Ollama for efficient local inference.
  • Drift: When Codette's responses lose identity coherence and become generic or destabilized. Drift recovery anchors her back to confirmed identity truths.

More Information

Model Card Authors

Jonathan Harrison (Raiff's Bits) with assistance from Claude (Anthropic)

Model Card Contact

Jonathan Harrison — Raiff1982 on Hugging Face

"Be like water — individuality with responsibility."
