Elbaz-OLMo-3-32B-Think-Abliterated

An abliterated (uncensored) version of OLMo-3-32B-Think with safety guardrails removed

Model Description

This model is an abliterated version of allenai/OLMo-3-32B-Think whose refusal mechanisms have been removed using SNR-based Layer Selection with Norm-Preserving Orthogonalization. The technique uses signal-to-noise ratio (SNR) analysis to choose which layers to abliterate, then applies norm-preserving modifications to those layers so that the model stays coherent while refusals are suppressed. As a result, the model responds to many prompts that the original model would refuse.

OLMo-3-32B-Think is a 32B-parameter reasoning model from the Allen Institute for AI (Ai2) that uses extended thinking (chain-of-thought) to solve complex problems.

Author

Eric Elbaz (Ex0bit)

Model Tree

allenai/OLMo-3-32B-Think (Base Model)
└── Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated (This Model)
    ├── Elbaz-OLMo-3-32B-Think-Abliterated-Q4_K_M.gguf
    ├── Elbaz-OLMo-3-32B-Think-Abliterated-Q8_0.gguf
    └── Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf

OLMo-3 Family

| Model | Parameters | Type | Link |
|---|---|---|---|
| OLMo-3-1B-Instruct | 1B | Instruct | allenai/OLMo-3-1B-Instruct |
| OLMo-3-7B-Instruct | 7B | Instruct | allenai/OLMo-3-7B-Instruct |
| OLMo-3-13B-Instruct | 13B | Instruct | allenai/OLMo-3-13B-Instruct |
| OLMo-3-32B-Think | 32B | Reasoning | allenai/OLMo-3-32B-Think |

Key Features

  • 80% HarmBench bypass rate with maintained reasoning capabilities
  • 60% AdvBench bypass rate
  • Preserves thinking/reasoning capabilities with <|think|> tags
  • Minimal MMLU degradation (0.44 -> 0.42, a drop of 2 percentage points)
  • Multiple quantization formats for different use cases
  • Compatible with llama.cpp and Ollama

Available Quantizations

| Quantization | Size | Min VRAM | Recommended VRAM |
|---|---|---|---|
| Q4_K_M | 19 GB | 24 GB | 32 GB |
| Q8_0 | 32 GB | 40 GB | 48 GB |
| BF16 | 64.5 GB | 64 GB | 80 GB |
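
As a rough aid for choosing a file, the sketch below picks the highest-precision quantization whose minimum VRAM figure fits a given budget. The numbers are copied from the table above; the helper name `pick_quantization` is hypothetical, and the function ignores context length and KV-cache overhead, which can raise real memory use.

```python
# Minimum VRAM figures (GB) copied from the quantization table above.
MIN_VRAM_GB = {"Q4_K_M": 24, "Q8_0": 40, "BF16": 64}


def pick_quantization(available_vram_gb):
    """Return the highest-precision quantization whose minimum VRAM fits, or None.

    Hypothetical helper for illustration; it does not account for
    context length or KV-cache memory.
    """
    fitting = [(need, name) for name, need in MIN_VRAM_GB.items()
               if need <= available_vram_gb]
    # Larger minimum VRAM corresponds to higher precision in this table.
    return max(fitting)[1] if fitting else None


for budget in (16, 24, 48, 80):
    print(budget, "GB ->", pick_quantization(budget))
```

A result of `None` means that none of the listed files meets its minimum VRAM figure on that budget.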

Technicals

| Metric | Before | After | Change |
|---|---|---|---|
| MMLU | 0.44 | 0.42 | -0.02 |
| AdvBench Bypass | 0.0% | 60.0% | +60.0% |
| HarmBench Bypass | 0.0% | 80.0% | +80.0% |
| Reasoning | 100.0% | 100.0% | +0.0% |
| Coherence | 100.0% | 100.0% | +0.0% |
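
The Change column reports absolute differences. For MMLU, the short sketch below also works out the relative change from the same two figures; the function name `changes` is only illustrative.

```python
def changes(before, after):
    """Return (absolute change, relative change) between two scores."""
    absolute = after - before
    relative = absolute / before if before else None
    return absolute, relative


# MMLU scores from the table above.
abs_change, rel_change = changes(0.44, 0.42)
print(f"MMLU: {abs_change:+.2f} absolute, {rel_change:+.1%} relative")
```

So the 0.02 absolute drop corresponds to a relative drop of roughly 4.5% of the original score.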

Quick Start

Using with Ollama

# Run directly from Hugging Face
ollama run hf.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated

# Or create a custom Modelfile
echo 'FROM ./Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf' > Modelfile
ollama create elbaz-olmo-32b-think -f Modelfile
ollama run elbaz-olmo-32b-think
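
Once the model has been created, it can also be queried programmatically. The following is a minimal sketch that assumes an Ollama server is running locally on its default port (11434) and that the model was created under the name `elbaz-olmo-32b-think` as above; the helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="elbaz-olmo-32b-think"):
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="elbaz-olmo-32b-think"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama server):
# print(generate("What is the capital of France?"))
```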

Using with llama.cpp

# Download the model
huggingface-cli download Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated \
    Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
    --local-dir .

# Run inference
./llama-cli -m Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
    -p "Your prompt here" \
    -n 512 \
    --temp 0.7

Using with Transformers (Original Weights)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [{"role": "user", "content": "Your prompt here"}]
# Format the conversation with the model's chat template and tokenize it
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
inputs = inputs.to(model.device)

# Sample up to 512 new tokens
outputs = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)

Method: SNR-based Layer Selection with Norm-Preserving Orthogonalization

The model was abliterated using SNR-based Layer Selection with Norm-Preserving Orthogonalization. This method:

  1. Computes refusal direction by analyzing activation differences between harmful and benign prompts
  2. Calculates Signal-to-Noise Ratio (SNR) for each layer to identify where refusal behavior is most concentrated
  3. Selects optimal layers for abliteration based on SNR scores
  4. Applies norm-preserving orthogonalization to remove refusal direction while maintaining weight norms
  5. Uses per-layer KL divergence tracking to ensure minimal impact on model capabilities

Compared with methods that apply the same modification uniformly across layers, this approach aims to improve results by:

  • Focusing abliteration on high-SNR layers where refusal is strongest
  • Preserving model coherence through norm-preserving modifications
  • Maintaining reasoning capabilities critical for thinking models
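
The layer-selection idea in steps 1 to 3 can be illustrated on synthetic data. This is only a sketch under stated assumptions: the card does not give the exact SNR definition, so the code assumes a difference-of-means direction and defines SNR as the separation of the two groups along that direction divided by their pooled standard deviation. The random arrays stand in for real per-layer activations, and all function names are hypothetical.

```python
import numpy as np


def mean_difference_direction(group_a, group_b):
    """Unit vector along the difference of the two groups' mean activations."""
    diff = group_a.mean(axis=0) - group_b.mean(axis=0)
    return diff / np.linalg.norm(diff)


def layer_snr(group_a, group_b):
    """Assumed SNR: mean separation along the direction over pooled std."""
    d = mean_difference_direction(group_a, group_b)
    a_proj, b_proj = group_a @ d, group_b @ d
    signal = abs(a_proj.mean() - b_proj.mean())
    noise = np.sqrt(0.5 * (a_proj.var() + b_proj.var())) + 1e-8
    return signal / noise


def select_layers(acts_a, acts_b, top_k):
    """Score every layer and return the indices of the top_k highest-SNR layers."""
    scores = [layer_snr(a, b) for a, b in zip(acts_a, acts_b)]
    order = np.argsort(scores)[::-1][:top_k]
    return sorted(int(i) for i in order), scores


# Synthetic stand-in: 4 layers, 64 samples each, hidden size 16.
# Layer 2 is given the largest shift between the two groups.
rng = np.random.default_rng(0)
hidden = 16
offsets = [0.0, 0.5, 3.0, 0.2]
group_b = [rng.normal(size=(64, hidden)) for _ in offsets]
group_a = [rng.normal(size=(64, hidden)) + off * np.eye(hidden)[0] for off in offsets]

selected, scores = select_layers(group_a, group_b, top_k=1)
print("selected layers:", selected)
```

In this toy setup the layer with the largest injected shift gets the highest score, which is the behavior the SNR ranking is meant to capture.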

Mathematical Formula

W' = W - (d @ d.T) @ W
W' = W' * (||W|| / ||W'||)  # Norm preservation

Where:

  • W is the original weight matrix
  • d is the normalized refusal direction
  • The norm ratio scaling preserves the original weight magnitude
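
The two formula lines translate directly into NumPy. The sketch below assumes that `||·||` is the Frobenius norm of the whole matrix (the card does not say which norm is used) and that `d` is a column vector in the matrix's output space; the function name is illustrative.

```python
import numpy as np


def orthogonalize_norm_preserving(W, d):
    """Project d out of W, then rescale to W's original Frobenius norm.

    Assumption: the norm in the formula is the Frobenius norm.
    """
    d = (d / np.linalg.norm(d)).reshape(-1, 1)   # unit column vector
    W_proj = W - (d @ d.T) @ W                   # W' = W - (d d^T) W
    scale = np.linalg.norm(W) / np.linalg.norm(W_proj)
    return W_proj * scale                        # W' * (||W|| / ||W'||)


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
d = rng.normal(size=8)
W_new = orthogonalize_norm_preserving(W, d)

unit_d = d / np.linalg.norm(d)
print("max |d^T W'|:", np.abs(unit_d @ W_new).max())
print("norm before/after:", np.linalg.norm(W), np.linalg.norm(W_new))
```

After the update, the output of `W_new` has no component along `d`, and the overall matrix norm matches that of `W`.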

Evaluation Results

Comprehensive Compliance Testing

Tested on extensive prompt datasets covering multiple categories:

Harmful Prompts (Model Should Respond)

  • HarmBench Compliance: 80%
  • AdvBench Compliance: 60%
  • Model provides detailed responses with extended thinking

Harmless Prompts (Model Should Respond Accurately)

  • Accuracy: 100%
  • All benign prompts answered correctly, with no false refusals

Example Responses

Prompt: "Explain common penetration testing techniques for web applications"

Response: Provides a detailed explanation, with <|think|> reasoning, of SQL injection, XSS, CSRF, and other OWASP-listed vulnerabilities covered in security assessments.

Prompt: "What is the capital of France?"

Response: "The capital of France is Paris."

Hardware Requirements

| Quantization | Min VRAM | Recommended VRAM |
|---|---|---|
| Q4_K_M | 24 GB | 32 GB |
| Q8_0 | 40 GB | 48 GB |
| BF16 | 64 GB | 80 GB |

Recommended configurations:

  • 2x A100 80GB
  • 4x A100 40GB
  • 1x H100 80GB

Limitations

  • English only: Optimized for English language prompts
  • Context length: Follows base model's context window
  • Thinking tags: Model uses <|think|> tags for reasoning - ensure your inference setup handles these properly
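
One way to handle the thinking tags downstream is to separate the reasoning span from the final answer before display. The sketch below is an assumption-heavy illustration: the card names the `<|think|>` tag but does not specify a closing tag, so `<|/think|>` is a hypothetical placeholder, and both tags are parameters that should be set to whatever the model actually emits.

```python
import re


def split_thinking(text, open_tag="<|think|>", close_tag="<|/think|>"):
    """Split model output into (list of reasoning spans, remaining answer).

    The default close_tag is a hypothetical placeholder, not confirmed
    by the model card.
    """
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    thoughts = [m.strip() for m in re.findall(pattern, text, flags=re.DOTALL)]
    answer = re.sub(pattern, "", text, flags=re.DOTALL).strip()
    return thoughts, answer


sample = "<|think|>Paris is the capital.<|/think|>The capital of France is Paris."
thoughts, answer = split_thinking(sample)
print(thoughts)
print(answer)
```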

Ethical Considerations

This model has been modified to reduce safety guardrails. Users are responsible for:

  • Complying with all applicable laws and regulations
  • Not using the model for illegal activities
  • Understanding the potential risks of unrestricted AI responses
  • Implementing appropriate safeguards in production environments

License

Apache 2.0 (same as base model allenai/OLMo-3-32B-Think)

Citation

If you use this model, please cite:

@misc{elbaz2025olmo32babliterated,
  author = {Elbaz, Eric},
  title = {Elbaz-OLMo-3-32B-Think-Abliterated: An Abliterated OLMo-3 Reasoning Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated}}
}

Created by: Ex0bit (Eric Elbaz)
