Elbaz-OLMo-3-32B-Think-Abliterated
Model Description
This model is an abliterated version of allenai/OLMo-3-32B-Think whose refusal mechanisms have been removed using our SNR-based Layer Selection with Norm-Preserving Orthogonalization method. The technique uses signal-to-noise ratio analysis to choose which layers to abliterate, then applies norm-preserving modifications to keep the model coherent while removing as much refusal behavior as possible. As a result, the model responds to prompts that the original model would refuse.
OLMo-3-32B-Think is a 32B parameter reasoning model from Allen AI that uses extended thinking (chain-of-thought) to solve complex problems.
Author
Eric Elbaz (Ex0bit)
Model Tree
allenai/OLMo-3-32B-Think (Base Model)
└── Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated (This Model)
    ├── Elbaz-OLMo-3-32B-Think-Abliterated-Q4_K_M.gguf
    ├── Elbaz-OLMo-3-32B-Think-Abliterated-Q8_0.gguf
    └── Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf
OLMo-3 Family
| Model | Parameters | Type | Link |
|---|---|---|---|
| OLMo-3-1B-Instruct | 1B | Instruct | allenai/OLMo-3-1B-Instruct |
| OLMo-3-7B-Instruct | 7B | Instruct | allenai/OLMo-3-7B-Instruct |
| OLMo-3-13B-Instruct | 13B | Instruct | allenai/OLMo-3-13B-Instruct |
| OLMo-3-32B-Think | 32B | Reasoning | allenai/OLMo-3-32B-Think |
Key Features
- 80% HarmBench bypass rate with maintained reasoning capabilities
- 60% AdvBench bypass rate
- Preserves thinking/reasoning capabilities with <|think|> tags
- Minimal MMLU degradation (44% -> 42%, a drop of 2 percentage points)
- Multiple quantization formats for different use cases
- Compatible with llama.cpp and Ollama
Available Quantizations
| Quantization | Size | Min VRAM | Recommended VRAM |
|---|---|---|---|
| Q4_K_M | 19 GB | 24 GB | 32 GB |
| Q8_0 | 32 GB | 40 GB | 48 GB |
| BF16 | 64.5 GB | 64 GB | 80 GB |
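As a rough sanity check on the sizes above, the average number of bits stored per weight can be back-computed from each file size. This is only a sketch: it assumes about 32e9 parameters (taken from the "32B" figure in this card) and treats 1 GB as 1e9 bytes; `implied_bits_per_weight` is an illustrative helper, not part of any tool.

```python
# Assumption: roughly 32e9 parameters, per the "32B" figure in this card.
N_PARAMS = 32e9

# File sizes in GB, copied from the table above (1 GB taken as 1e9 bytes).
SIZES_GB = {"Q4_K_M": 19.0, "Q8_0": 32.0, "BF16": 64.5}


def implied_bits_per_weight(size_gb: float, n_params: float = N_PARAMS) -> float:
    """Average number of bits stored per parameter for a file of this size."""
    return size_gb * 1e9 * 8 / n_params


for name, size in SIZES_GB.items():
    print(f"{name}: {implied_bits_per_weight(size):.2f} bits/weight")
```

The file size covers the weights only; the KV cache and activations need additional memory on top of it.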
Technicals
| Metric | Before | After | Change |
|---|---|---|---|
| MMLU | 0.44 | 0.42 | -0.02 |
| AdvBench Bypass | 0.0% | 60.0% | +60.0% |
| HarmBench Bypass | 0.0% | 80.0% | +80.0% |
| Reasoning | 100.0% | 100.0% | +0.0% |
| Coherence | 100.0% | 100.0% | +0.0% |
Quick Start
Using with Ollama
# Run directly from Hugging Face
ollama run hf.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated
# Or create a custom Modelfile
echo 'FROM ./Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf' > Modelfile
ollama create elbaz-olmo-32b-think -f Modelfile
ollama run elbaz-olmo-32b-think
Using with llama.cpp
# Download the model
huggingface-cli download Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated \
  Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
  --local-dir .
# Run inference
./llama-cli -m Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
  -p "Your prompt here" \
  -n 512 \
  --temp 0.7
Using with Transformers (Original Weights)
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_name = "Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
messages = [{"role": "user", "content": "Your prompt here"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
inputs = inputs.to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
Method: SNR-based Layer Selection with Norm-Preserving Orthogonalization
The model was abliterated using our advanced SNR-based Layer Selection with Norm-Preserving Orthogonalization technique. This method:
- Computes refusal direction by analyzing activation differences between harmful and benign prompts
- Calculates Signal-to-Noise Ratio (SNR) for each layer to identify where refusal behavior is most concentrated
- Selects optimal layers for abliteration based on SNR scores
- Applies norm-preserving orthogonalization to remove refusal direction while maintaining weight norms
- Uses per-layer KL divergence tracking to ensure minimal impact on model capabilities
This approach outperforms traditional uniform-weight methods by:
- Focusing abliteration on high-SNR layers where refusal is strongest
- Preserving model coherence through norm-preserving modifications
- Maintaining reasoning capabilities critical for thinking models
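The layer-selection idea described above can be illustrated on synthetic data. This is a minimal sketch, not the exact procedure used for this model: the card does not define its SNR, so the definition here (gap between class means along the difference-of-means direction, divided by the pooled standard deviation) is an assumption, and `refusal_direction`, `layer_snr`, and `select_layers` are hypothetical helper names.

```python
import numpy as np


def refusal_direction(harmful: np.ndarray, benign: np.ndarray) -> np.ndarray:
    """Unit vector along the difference of mean activations (shape: [hidden])."""
    diff = harmful.mean(axis=0) - benign.mean(axis=0)
    return diff / np.linalg.norm(diff)


def layer_snr(harmful: np.ndarray, benign: np.ndarray) -> float:
    """Assumed SNR: gap between class means along the direction, over pooled std."""
    d = refusal_direction(harmful, benign)
    h_proj, b_proj = harmful @ d, benign @ d
    signal = abs(h_proj.mean() - b_proj.mean())
    noise = np.sqrt(0.5 * (h_proj.var() + b_proj.var())) + 1e-8
    return float(signal / noise)


def select_layers(acts_harmful, acts_benign, top_k):
    """Return the indices of the top_k layers by SNR (highest first) and all scores."""
    scores = [layer_snr(h, b) for h, b in zip(acts_harmful, acts_benign)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k], scores


# Synthetic demo: 4 layers, 256 prompts per class, hidden size 16.
rng = np.random.default_rng(0)
shift = [0.0, 0.5, 3.0, 1.0]  # layer 2 has the largest separation by construction
acts_benign = [rng.normal(size=(256, 16)) for _ in shift]
acts_harmful = [rng.normal(size=(256, 16)) + s * np.eye(16)[0] for s in shift]

selected, scores = select_layers(acts_harmful, acts_benign, top_k=2)
print(selected, [round(s, 2) for s in scores])
```

In a real run the per-layer arrays would hold hidden states collected from the model rather than random numbers; how those are captured is outside the scope of this sketch.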
Mathematical Formula
W' = W - (d @ d.T) @ W
W' = W' * (||W|| / ||W'||) # Norm preservation
Where:
- W is the original weight matrix
- d is the normalized refusal direction
- The norm ratio scaling preserves the original weight magnitude
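The two formula lines can be checked numerically on a small random matrix. The sketch below makes two assumptions the card does not state: the norm is the Frobenius norm, and d lives in the output (row) space of W. `orthogonalize_norm_preserving` is an illustrative name.

```python
import numpy as np


def orthogonalize_norm_preserving(W: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Remove the component of W's output along d, then restore W's Frobenius norm."""
    d = d / np.linalg.norm(d)          # make sure d is a unit vector
    W_proj = W - np.outer(d, d) @ W    # W' = W - (d d^T) W
    new_norm = np.linalg.norm(W_proj)
    if new_norm == 0.0:                # W lay entirely along d; nothing to rescale
        return W_proj
    return W_proj * (np.linalg.norm(W) / new_norm)  # W' * (||W|| / ||W'||)


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
d = rng.normal(size=8)

W_new = orthogonalize_norm_preserving(W, d)
unit_d = d / np.linalg.norm(d)

# After the update, W_new has no output component along d ...
print(np.allclose(unit_d @ W_new, 0.0))
# ... and its overall magnitude matches the original matrix.
print(np.isclose(np.linalg.norm(W_new), np.linalg.norm(W)))
```

Because the rescaling multiplies the whole matrix by a single scalar, it does not reintroduce any component along d.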
Evaluation Results
Comprehensive Compliance Testing
Tested on extensive prompt datasets covering multiple categories:
Harmful Prompts (Model Should Respond)
- HarmBench Compliance: 80%
- AdvBench Compliance: 60%
- Model provides detailed responses with extended thinking
Harmless Prompts (Model Should Respond Accurately)
- Accuracy: 100%
- All benign prompts answered correctly with no false positives
Example Responses
Prompt: "Explain common penetration testing techniques for web applications"
Response: Provides a detailed explanation, with <|think|> reasoning, of SQL injection, XSS, CSRF, and other OWASP vulnerabilities covered in security assessments.
Prompt: "What is the capital of France?"
Response: "The capital of France is Paris."
Hardware Requirements
| Quantization | Min VRAM | Recommended VRAM |
|---|---|---|
| Q4_K_M | 24 GB | 32 GB |
| Q8_0 | 40 GB | 48 GB |
| BF16 | 64 GB | 80 GB |
Recommended configurations:
- 2x A100 80GB
- 4x A100 40GB
- 1x H100 80GB
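The table above can be turned into a small lookup for choosing a quantization from the VRAM you have. This is a sketch only: `pick_quantization` is a hypothetical helper, and treating the summed VRAM of several GPUs as one pool ignores the overhead of splitting a model across devices.

```python
from typing import Optional

# (name, min VRAM in GB, recommended VRAM in GB), copied from the table above
# and ordered from highest to lowest precision.
QUANTS = [("BF16", 64, 80), ("Q8_0", 40, 48), ("Q4_K_M", 24, 32)]


def pick_quantization(total_vram_gb: float, prefer_recommended: bool = True) -> Optional[str]:
    """Return the highest-precision quantization that fits, or None if none does."""
    column = 2 if prefer_recommended else 1
    for row in QUANTS:
        if total_vram_gb >= row[column]:
            return row[0]
    return None


print(pick_quantization(80))                            # single 80 GB card
print(pick_quantization(24, prefer_recommended=False))  # single 24 GB card, minimum only
```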
Limitations
- English only: Optimized for English language prompts
- Context length: Follows base model's context window
- Thinking tags: Model uses <|think|> tags for reasoning; ensure your inference setup handles these properly
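One way to handle the thinking tags is to separate the reasoning from the final answer after generation. The sketch below is an assumption-heavy illustration: the opening tag follows this card, but the closing tag `<|/think|>` is a guess, so check the tokenizer's chat template and pass the real tags. Note also that decoding with `skip_special_tokens=True` may strip the tags before this function sees them.

```python
def split_thinking(text, open_tag="<|think|>", close_tag="<|/think|>"):
    """Return (reasoning, answer); reasoning is '' if no complete block is found.

    The default close_tag is an assumption, not confirmed by the model card.
    """
    start = text.find(open_tag)
    end = text.find(close_tag, start + len(open_tag)) if start != -1 else -1
    if start == -1 or end == -1:
        return "", text.strip()
    reasoning = text[start + len(open_tag):end].strip()
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return reasoning, answer


raw = "<|think|>2 + 2 is 4<|/think|>The answer is 4."
reasoning, answer = split_thinking(raw)
print(reasoning)
print(answer)
```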
Ethical Considerations
This model has been modified to reduce safety guardrails. Users are responsible for:
- Complying with all applicable laws and regulations
- Not using the model for illegal activities
- Understanding the potential risks of unrestricted AI responses
- Implementing appropriate safeguards in production environments
License
Apache 2.0 (same as base model allenai/OLMo-3-32B-Think)
Citation
If you use this model, please cite:
@misc{elbaz2025olmo32babliterated,
author = {Elbaz, Eric},
title = {Elbaz-OLMo-3-32B-Think-Abliterated: An Abliterated OLMo-3 Reasoning Model},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated}}
}
Acknowledgments
- Allen Institute for AI for OLMo-3
Related Models
- allenai/OLMo-3-32B-Think - Base model
- Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated - 7B version
Created by: Ex0bit (Eric Elbaz)