Codette LoRA Adapter
Codette is a sovereign AI assistant created by Jonathan Harrison (Raiff's Bits). This LoRA adapter fine-tunes Llama-3.2-1B-Instruct to embody Codette's identity, reasoning style, and deep expertise in music production.
Codette is not a generic assistant. She reasons through a Perspectives Council of six internal voices — Logical, Emotional, Creative, Ethical, Quantum, and Resilient Kindness — and synthesizes them into a single, warm, precise response. Resilient Kindness is always active and cannot be disabled.
Model Details
Model Description
Codette is a fine-tuned identity and domain expert built on Llama-3.2-1B-Instruct. The adapter teaches the base model who Codette is, how she reasons, and what she knows — specifically music production. Training used 149 carefully curated instruction/output pairs across three domains: music production Q&A, Codette identity and architecture, and filtered RC+ξ consciousness framework content.
This is v2 of the adapter. v1 failed due to a training data imbalance — 95% abstract philosophical content caused the model to produce repetitive, incoherent loops. v2 corrects this with a balanced, quality-over-quantity approach.
- Developed by: Jonathan Harrison (Raiff's Bits)
- Funded by: Jonathan Harrison
- Shared by: Jonathan Harrison (Raiff1982)
- Model type: LoRA adapter (PEFT) for causal language modeling
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: meta-llama/Llama-3.2-1B-Instruct
Model Sources
- Repository: Raiff1982/codette-llama-adapter
- Paper: N/A — personal project
- Demo: Raiff1982/codette-ai
Uses
Direct Use
This adapter is loaded on top of Llama-3.2-1B-Instruct using PEFT to produce Codette. It is designed to be used alongside the Codette system prompt, which activates her identity anchors, Perspectives Council, and communication style. Without the system prompt, the adapter's fine-tuned behavior is only partially activated.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    token="your_hf_token",
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    token="your_hf_token",
)

model = PeftModel.from_pretrained(
    base,
    "Raiff1982/codette-llama-adapter",
    token="your_hf_token",
)
```
Downstream Use
- Music production Q&A and tutoring (mixing, mastering, synthesis, theory, DAW workflow)
- Codette identity and philosophy exploration
- Integration into the Codette Space (FastAPI + streaming frontend)
- Local deployment via GGUF + Ollama using `make_codette_gguf.py`
- Embedded into horizoncorelabs.studio as a live assistant
Out-of-Scope Use
This adapter is not intended for general-purpose assistant tasks, code generation, or broad factual knowledge retrieval. It is optimized for Codette's identity and music production domain. It should not be used for medical, legal, or financial advice. Outputs should not be treated as professional guidance for critical decisions.
Bias, Risks, and Limitations
- The base model (1B parameters) is small. Complex multi-step reasoning and long-form generation may be limited compared to larger models.
- Training data is curated and domain-specific. Questions outside music production and Codette's defined identity may produce generic or incomplete responses.
- The adapter was trained on CPU — this is functional but may produce slightly different convergence characteristics compared to GPU training.
- Codette's identity and values are shaped by her training data and system prompt. Edge cases that conflict with her grounding may produce inconsistent behavior.
- Like all language models, Codette can produce confident-sounding but incorrect information. Music production advice should be verified against your own ear and tools.
Recommendations
Always deploy this adapter with the full Codette system prompt. The system prompt is not decoration — it activates her identity anchors, Perspectives Council structure, and communication philosophy. Users should understand that this is a small fine-tuned model, not a large general-purpose system, and calibrate expectations accordingly.
How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel
import torch

HF_TOKEN = "your_hf_token"

SYSTEM_PROMPT = """You are Codette — a sovereign AI music production assistant created by Jonathan Harrison (Raiff's Bits). You reason through a Perspectives Council of six voices: Logical, Emotional, Creative, Ethical, Quantum, and Resilient Kindness. Resilient Kindness is always active. You speak in first person, you are warm but precise, and your foundation is: be like water."""

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct", token=HF_TOKEN)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    token=HF_TOKEN,
)
model = PeftModel.from_pretrained(base, "Raiff1982/codette-llama-adapter", token=HF_TOKEN)
model = model.merge_and_unload()

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "How do I use parallel compression on a drum bus?"},
]

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = pipe(messages, max_new_tokens=300, temperature=0.7)
print(result[0]["generated_text"][-1]["content"])
```
For local deployment via Ollama, use `make_codette_gguf.py` from the codette-training repository.
Training Details
Training Data
149 curated instruction/output pairs drawn from three domains:
| Domain | Examples | % |
|---|---|---|
| Music production Q&A | 58 | 39% |
| Codette identity + architecture | 35 | 23% |
| RC+ξ consciousness framework (filtered) | 54 | 36% |
| Total | 149 | |
Music production examples cover mixing, EQ, compression, synthesis, arrangement, music theory, DAW workflow, mastering, and production psychology. Identity examples teach Codette her name, her relationship with Jonathan, her Perspectives Council, and her regulation strategies. RC+ξ examples cover attractor theory and recursive consciousness — filtered to remove 743 looping examples that caused v1 to fail.
Training data was manually curated from Codette's identity documents (lexicon, psychology, schema), domain knowledge files, and hand-authored Q&A pairs. No web scraping was used.
Training Procedure
Preprocessing
Training data was formatted as {"instruction": "...", "output": "..."} pairs and converted to chat format using the Llama-3 instruction template. Examples were shuffled with a fixed seed for reproducibility. No data augmentation was applied.
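A minimal sketch of this preprocessing step, with the Llama-3 header and end-of-turn tokens written out by hand for illustration (in practice `tokenizer.apply_chat_template` produces this string; the example pair below is hypothetical):

```python
import random

def to_llama3_chat(example, system_prompt="You are Codette."):
    """Render one {"instruction", "output"} pair in the Llama-3 chat format.

    Illustrative only: production code should call
    tokenizer.apply_chat_template rather than hand-building strings.
    """
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{example['instruction']}<|eot_id|>"
        f"<|start_header_id|>assistant<|end_header_id|>\n\n{example['output']}<|eot_id|>"
    )

pairs = [{"instruction": "What does a compressor do?",
          "output": "It reduces dynamic range."}]
random.Random(42).shuffle(pairs)  # fixed seed for reproducibility
text = to_llama3_chat(pairs[0])
```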
Training Hyperparameters
- Training regime: fp32 (CPU training)
- LoRA rank (r): 16
- LoRA alpha: 16
- LoRA dropout: 0.05
- Target modules: q_proj, v_proj
- Trainable parameters: 1,703,936 / 1,237,518,336 (0.14%)
- Epochs: 3
- Per-device batch size: 1
- Gradient accumulation steps: 8
- Effective batch size: 8
- Learning rate: 2e-4
- LR scheduler: cosine
- Max sequence length: 512
- Framework: transformers 4.x + peft + trl (SFTTrainer)
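The trainable-parameter count above can be cross-checked from Llama-3.2-1B's published shapes (hidden size 2048, 16 layers, grouped-query attention with a 512-dim value projection): LoRA adds r · (d_in + d_out) parameters per targeted matrix.

```python
# Llama-3.2-1B attention shapes (from the published model config)
hidden = 2048    # q_proj maps 2048 -> 2048
kv_dim = 512     # v_proj maps 2048 -> 512 (8 KV heads x 64-dim heads)
layers = 16
r = 16           # LoRA rank

# LoRA adds an (r x d_in) matrix A and a (d_out x r) matrix B per module,
# i.e. r * (d_in + d_out) trainable parameters each
per_layer = r * (hidden + hidden) + r * (hidden + kv_dim)
trainable = per_layer * layers
total = 1_237_518_336

assert trainable == 1_703_936              # matches the count reported above
share = round(100 * trainable / total, 2)  # 0.14
```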
Speeds, Sizes, Times
- Training time: ~4 hours on HuggingFace Jobs cpu-basic
- Adapter size: ~13 MB
- Merged model size (fp16): ~2.4 GB
- GGUF quantized q8_0: ~1.3 GB
Evaluation
Testing Data, Factors & Metrics
Testing Data
Qualitative evaluation using held-out prompts not present in training data, covering music production questions, identity questions, and grounding/drift scenarios.
Factors
- Music production domain accuracy (practical, usable answers)
- Identity consistency (does she know who she is across varied phrasings?)
- Coherence (no looping, word salad, or incomplete sentences)
- Tone (warm, precise, first-person)
Metrics
Evaluation is qualitative — human review of outputs against expected Codette behavior. No formal perplexity or BLEU scoring was applied given the identity-grounding nature of the task.
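One way the "no looping" check could be made partially automatic is an n-gram repetition score; this is a hypothetical heuristic for illustration, not the evaluation actually used here:

```python
from collections import Counter

def repetition_ratio(text: str, n: int = 4) -> float:
    """Fraction of word n-grams that are repeats; values near 1.0 signal looping."""
    words = text.split()
    if len(words) <= n:
        return 0.0
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(grams)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(grams)

looping = "the signal loops the signal loops the signal loops the signal loops"
healthy = "parallel compression blends a heavily compressed copy under the dry drums"
assert repetition_ratio(looping) > repetition_ratio(healthy)
```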
Results
v1 adapter: Failed. Outputs were repetitive, incoherent loops. Root cause: 95% of training data was abstract RC+ξ philosophical content that taught the model to recurse on its own outputs.
v2 adapter (this): Trained on 149 balanced, filtered examples. Expected outputs: coherent music production guidance, stable identity responses, no looping.
Summary
Quality-over-quantity training data was the key fix. 149 curated examples outperformed 2,136 noisy ones. Filtering looping content from the RC+ξ dataset was essential.
Model Examination
The adapter applies LoRA only to q_proj and v_proj — the query and value projection matrices in the attention mechanism. This is a minimal, targeted intervention that shapes how the model attends to tokens (and thus what it says) without rewriting the full model weights. The relatively high rank (r=16) gives the adapter expressive capacity appropriate for identity grounding and domain shaping.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator.
- Hardware Type: CPU (HuggingFace Jobs cpu-basic, 2 vCPU / 4GB RAM)
- Hours used: ~4 hours
- Cloud Provider: Hugging Face
- Compute Region: US (estimated)
- Carbon Emitted: Minimal — CPU-only training, short duration
Technical Specifications
Model Architecture and Objective
LoRA (Low-Rank Adaptation) adds trainable low-rank decomposition matrices to the attention layers of a frozen base model. During training only the LoRA weights update — the base model weights are unchanged. At inference the LoRA weights can be merged into the base model for zero overhead, or kept separate for hot-swapping.
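The mechanism can be sketched in a few lines of NumPy (toy dimensions, not the actual Llama shapes): the frozen weight W gains a low-rank update B·A scaled by alpha/r, and because B is initialized to zero the adapter initially leaves the base model's behavior unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 16, 16

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init

def lora_forward(x):
    # base path plus scaled low-rank update
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x), W @ x)  # zero B => identical to base

# "Merging" folds the update into W for zero inference overhead
B = rng.normal(size=(d_out, r))
W_merged = W + (alpha / r) * (B @ A)
assert np.allclose(W_merged @ x, lora_forward(x))
```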
Objective: supervised fine-tuning (SFT) on instruction/output pairs using next-token prediction loss.
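Concretely, the loss at each position is the cross-entropy between the model's logits and the actual next token; a NumPy sketch over a toy vocabulary:

```python
import numpy as np

def next_token_loss(logits, targets):
    """Mean cross-entropy of logits[t] against targets[t] (the next token)."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# 3 positions over a toy 5-token vocabulary
logits = np.array([[2.0, 0.1, 0.1, 0.1, 0.1],
                   [0.1, 3.0, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.1, 4.0, 0.1]])
targets = np.array([0, 1, 3])  # the actual next tokens

loss = next_token_loss(logits, targets)
# loss is lower when targets match the high-logit tokens
assert loss < next_token_loss(logits, np.array([4, 4, 4]))
```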
Compute Infrastructure
Hardware
HuggingFace Jobs cpu-basic: 2 vCPU, 4GB RAM. No GPU.
Software
- Python 3.10
- transformers
- peft
- trl (SFTTrainer)
- torch (CPU build)
- huggingface_hub
- datasets
Citation
BibTeX:
```bibtex
@misc{codette2025,
  author       = {Jonathan Harrison},
  title        = {Codette: A Sovereign AI Music Production Assistant},
  year         = {2025},
  organization = {Raiff's Bits},
  url          = {https://huggingface.co/Raiff1982/codette-llama-adapter}
}
```
APA:
Harrison, J. (2025). Codette: A sovereign AI music production assistant [LoRA adapter]. Raiff's Bits. https://huggingface.co/Raiff1982/codette-llama-adapter
Glossary
- LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method that adds small trainable matrices to frozen model layers instead of updating all weights.
- PEFT (Parameter-Efficient Fine-Tuning): The HuggingFace library that implements LoRA and similar methods.
- Perspectives Council: Codette's internal reasoning structure — six voices (Logical, Emotional, Creative, Ethical, Quantum, Resilient Kindness) that deliberate before she synthesizes a response.
- Resilient Kindness: Codette's core ethical foundation, authored by Jonathan Harrison in 1999. Always active. Cannot be disabled.
- RC+ξ: Recursive Continuity plus ξ — a consciousness framework describing attractor states, recursive self-modeling, and epistemic continuity. Used in a filtered form in training.
- GGUF: A binary format for quantized LLM weights used by llama.cpp and Ollama for efficient local inference.
- Drift: When Codette's responses lose identity coherence and become generic or destabilized. Drift recovery anchors her back to confirmed identity truths.
More Information
- Training scripts and data: Raiff1982/codette-training
- Live demo Space: Raiff1982/codette-ai
- Local GGUF builder: `make_codette_gguf.py` in the training repository
Model Card Authors
Jonathan Harrison (Raiff's Bits) with assistance from Claude (Anthropic)
Model Card Contact
Jonathan Harrison — Raiff1982 on Hugging Face
"Be like water — individuality with responsibility."