
olmo3-7b-slerp

olmo3-7b-slerp is a merge of the following Olmo 3 7B variants, created with LazyMergekit using the SLERP merge method:

- allenai/Olmo-3-7B (base)
- allenai/Olmo-3-7B-Instruct
- allenai/Olmo-3-7B-Think
- allenai/Olmo-3-7B-Think-SFT
- allenai/Olmo-3-7B-Think-DPO
- allenai/Olmo-3-7B-RL-Zero-IF
- allenai/Olmo-3-7B-RL-Zero-Math
- allenai/Olmo-3-7B-RL-Zero-Code
- allenai/Olmo-3-7B-RL-Zero-Mix
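For intuition about the method: SLERP (spherical linear interpolation) blends two weight tensors along the arc between them rather than along a straight line, which tends to preserve weight magnitudes better than plain averaging. Below is a minimal NumPy sketch of the idea, illustrative only and not mergekit's actual implementation:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two flattened weight tensors."""
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two weight directions
    if theta < eps:
        # Nearly colinear weights: fall back to ordinary linear interpolation.
        return (1 - t) * v0 + t * v1
    return (np.sin((1 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)
```

The `t` values in the configuration below control this interpolation: the `self_attn` and `mlp` filters define a schedule of `t` across layer depth, and the trailing `value: 0.5` is the default for all remaining parameters.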

🧩 Configuration

```yaml
slices:
  - sources:
      - model: allenai/Olmo-3-7B
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-Instruct
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-Think
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-Think-SFT
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-Think-DPO
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-RL-Zero-IF
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-RL-Zero-Math
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-RL-Zero-Code
        layer_range: [0, 32]
      - model: allenai/Olmo-3-7B-RL-Zero-Mix
        layer_range: [0, 32]
base_model: allenai/Olmo-3-7B
experts:
  - source_model: allenai/Olmo-3-7B
    weight: 0.2
  - source_model: allenai/Olmo-3-7B-Instruct
    weight: 0.1
  - source_model: allenai/Olmo-3-7B-Think
    weight: 0.1
  - source_model: allenai/Olmo-3-7B-Think-SFT
    weight: 0.1
  - source_model: allenai/Olmo-3-7B-Think-DPO
    weight: 0.1
  - source_model: allenai/Olmo-3-7B-RL-Zero-IF
    weight: 0.1
  - source_model: allenai/Olmo-3-7B-RL-Zero-Math
    weight: 0.1
  - source_model: allenai/Olmo-3-7B-RL-Zero-Code
    weight: 0.1
  - source_model: allenai/Olmo-3-7B-RL-Zero-Mix
    weight: 0.1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
merge_method: slerp
dtype: bfloat16
layer_range: [0, 32]
```
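To reproduce the merge locally, the configuration above can be fed to mergekit directly. A minimal sketch, assuming the YAML is saved as `config.yaml` (a hypothetical filename) and mergekit is installed (`pip install mergekit`); this mirrors what the LazyMergekit notebook runs under the hood:

```python
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the merge recipe shown above.
with open("config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the merge and write the resulting model to ./merge.
run_merge(
    merge_config,
    out_path="./merge",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is available
        copy_tokenizer=True,             # copy the base model's tokenizer into the output
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```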

πŸ’» Usage

```python
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "jsuheb/olmo3-7b-slerp"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,  # reuse the tokenizer loaded above
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},  # 4-bit via bitsandbytes
)

# Build a chat-formatted prompt and sample a completion.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
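If you prefer not to go through `pipeline()`, the checkpoint can also be loaded directly. A minimal sketch, assuming transformers with bitsandbytes installed; `BitsAndBytesConfig` is the current replacement for the older `load_in_4bit` model kwarg:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "jsuheb/olmo3-7b-slerp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,  # match the float16 compute used above
    ),
    device_map="auto",  # let accelerate place the weights
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```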